[SciPy-dev] Some concerns on Scipy development

eric eric at scipy.org
Wed Mar 27 10:53:46 CST 2002


Hey Pearu,

>
> > I've thought about this a little lately also.  There is a philosophical
> > difference to packaging among the scientific developers.  Some wish for
> > small single purpose and stand alone packages that are installed one by
> > one.  Others wish for a single "standard library" of scientific tools
> > that, once installed, is a one stop shop for a large number of scientific
> > algorithms.  There are benefits to both.  However, I come squarely down in
> > the second camp.  A monolithic package is easier to install for end users,
> > and it solves compatibility issues (such as SciPy changing the behavior of
> > Numeric in some places).  I believe the existence of such a package is
> > required before there can be a mass conversion of engineers and scientists
> > to Python as their tool of choice for daily tasks.  This is the goal of
> > SciPy.
>
> I have been at peace with this goal of SciPy for a long time. In my
> concerns I was not trying to propose to change this general goal in any
> way. Instead, I was concerned about the internal structure of SciPy and
> wanted to see if we could ease the SciPy development and make it more
> robust for the future.

Right.  I didn't think you were -- I just wanted to note the difference of
opinions on this and explain where SciPy fits in the picture.

>
> One efficient way to achieve that would be to require that internal modules
> in SciPy would be as independent as possible. A good measure for this
> independence is that a particular module can be installed as a standalone.

I agree and think your suggestion to move as far as possible in this direction
is good. But, I also don't think dependence on a single package is too much of
a price to pay.  There is already some difference in scipy_base/scipy_lite
(whatever it is called) and Numeric's behavior.  We need to import this instead
of Numeric directly to ensure current and future linalg, etc. modules comply
with the expected behavior in SciPy.  Also, scipy_base has many convenience
functions that will be helpful in other places.
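
A rough sketch of the "import the base package instead of Numeric directly"
idea, for illustration only.  It is not the actual scipy_base code: the
standard-library math module stands in for Numeric, make_base and norm2 are
hypothetical names, and the sqrt override is an assumed example of a behavior
difference.  The point is that dependent code goes through one namespace, so
any customization is picked up everywhere.

```python
import cmath
import math
import types


def make_base(backend):
    """Build a namespace that subsumes `backend` and customizes a few names.

    `backend` is a stand-in for Numeric; here it is the stdlib math module.
    """
    base = types.SimpleNamespace()
    # Subsume: re-export every public name from the backend unchanged.
    for name in dir(backend):
        if not name.startswith("_"):
            setattr(base, name, getattr(backend, name))

    # Customize: an assumed behavior difference, chosen only for illustration.
    # The backend raises on negative input; the base returns a complex root.
    def sqrt(x):
        return backend.sqrt(x) if x >= 0 else cmath.sqrt(x)

    base.sqrt = sqrt
    return base


# Hypothetical stand-in for the shared base package.
base = make_base(math)


def norm2(v, lib=base):
    """Example dependent routine: it uses the base namespace, not the backend."""
    return lib.sqrt(sum(x * x for x in v))
```

Code written against base sees base.sqrt(-4) return a complex value, while
math.sqrt(-4) still raises.  A module that imported the backend directly would
silently get the uncustomized behavior, which is the compatibility problem
described above.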

> Note that I am not proposing this because I would like to use these
> modules as standalone modules myself (or any other party), but only to
> strengthen SciPy by making it more robust internally.

I actually am a consumer in this case. I would like to use modules outside
of SciPy on occasion, and want to make it as easy as possible within the SciPy
framework.  Witness weave.  It seems like the scipy_base concept accomplishes
this.  If you're willing to include Numeric as a requirement, adding scipy_base
shouldn't be an issue.

>
> By doing this, it does not mean that the main goal of SciPy is
> somehow threatened, it will be still a monolithic package for end-users.
> Just its internal structure will be modular and less sensitive to adding
> new modules or reviewing some if needed in future.

Again, I agree -- I think we are on the same page.

>
> Now about the question whether SciPy parts can be completely independent?
> I think this can never be achieved in principle, nor is it desired,
> but it is a good ideal to follow *whenever* it is possible (and not
> just a nice thing to do as you say) and, indeed, can be practical for
> other projects, and all that for the sake of SciPy's own success.
>
> <snip>
>
> > So, I think this is a worthy goal for *some* of the modules (notably the
> > ones people are discussing such as integrate, linalg, etc), with one
> > caveat.  These modules need access to some functions provided by scipy and
> > will need to import at least one extra module.  Scanning linalg, the
> > needed functions are amax, amin, triu, etc. and a handful of functions
> > subsumed from Numeric as well as some constants from scipy.limits.  I
> > consider it a bad idea to replicate these functions across multiple
> > modules because of the maintenance issues associated with duplicate code.
> > I don't want to go down that path.
>
> Me neither. However, your statement that these modules necessarily need
> access to scipy functions is a bit exaggerated.
> In general, there are several ways the same functionality can be
> implemented, and it is my experience that linalg2 can be implemented
> without the scipy dependence and without replicating any code.

This may be the case.  Please let us know what you have in mind.  Travis has
implemented a lot of stuff that uses functions that are currently in scipy and
will be in scipy_lite.  The linalg interfaces to solve, expm, etc. may not
currently be the most efficient, but, by all reports, they are working pretty
well and address many problems.  I'm sure we will need to rework the interface
some -- I personally see the need for lu_factor and lu_solve functions that are
thinly layered over getrf and getrs for efficiency.  I'm sure there are other
places that linear algebra gurus could point out.  Waiting for the perfect
interface, though, makes people like Jochen, who is waiting on a (somewhat)
stable release, continue to wait.  If the only problem is efficiency, I say we get a
release based on the current interface out there, and solve the efficiency
issues in the next release.
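
To make the lu_factor / lu_solve split concrete, here is a minimal pure-Python
sketch.  It is a stand-in, not a LAPACK wrapper: the real thing would hand the
work to getrf (factor) and getrs (solve).  The list-of-rows matrices, the
packed L/U storage, and the pivot convention are assumptions chosen for
illustration.  What it shows is why two functions are more efficient than one
solve call: the O(n^3) factorization happens once and each later solve is only
O(n^2).

```python
def lu_factor(a):
    """Factor a square matrix (list of rows) with partial pivoting.

    Returns (lu, piv).  lu packs L below the diagonal (unit diagonal implied)
    and U on and above it; piv[k] is the row that was swapped with row k.
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]  # work on a copy
    piv = list(range(n))
    for k in range(n):
        # Choose the largest entry in column k, at or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        piv[k] = p
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
        # Eliminate below the pivot, storing the multipliers in place.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu_and_piv, b):
    """Solve A x = b using the output of lu_factor."""
    lu, piv = lu_and_piv
    n = len(lu)
    x = [float(v) for v in b]
    # Apply the recorded row swaps to the right-hand side.
    for k in range(n):
        if piv[k] != k:
            x[k], x[piv[k]] = x[piv[k]], x[k]
    # Forward substitution with the unit-lower factor L.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with the upper factor U.
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

Usage would be to call lu_factor once and keep the result, then call lu_solve
for each new right-hand side, for example
f = lu_factor([[2, 1], [1, 3]]) followed by lu_solve(f, [3, 5]) and
lu_solve(f, [1, 0]).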

One other note.  I do not see the interface of a 0.2 package as set in stone.
Users are considered "early adopters."  If there is good reason to change the
interface between 0.2 and 0.3 then we should do it.  When we get up in the .6 or
.7 range, then we should be more careful about changes.  But for now, like f2py,
the changes are OK.

Perhaps we should start a thread discussing the SciPy linear algebra interface.
Would this be helpful?

> In fact, using high-level scipy convenience functions in linalg2
> that is supposed to provide highly efficient and yet user-friendly
> (yes, both goals can be achieved at the same time!) algorithms, is not good
> because scipy functions are inefficient due to their general
> purpose nature and the initial wins in performance are lost.

Some can be made efficient.  Some will be less so.  I'm more worried about
getting a working version out that (hopefully) can be made efficient in the
future than I am in optimizing it right now.

If we want to make changes to linalg, let's discuss specifics.

>
> Therefore low level modules like linalg, integrate, etc must be carefully
> implemented even if it takes more time and seemingly direct Python hooks
> could be applied.
>
> > So this is what the site-packages view of scipy would be:
> >
> >     site-packages
> >         scipy_distutils
> >         scipy_test
> >         scipy_level0
> >             subsumes and customizes Numeric
> >             handy.py
> >             misc.py
> >             scimath.py
> >             Matrix.py (?)
> >             fastumath.so (pyd)
> >             etc.
> >         scipy
> >             subsume scipy_base
> >             everything else
>
> This looks like a positive plan to me.
>
> Any other candidates for naming scipy_level0? It reflects too much
> the internals of SciPy but will contain very useful general purpose
> functions, I assume, to be useful more widely.
> How about scipy_base?

scipy_base is fine with me.

> Another idea would be then to move scipy_test inside scipy_base (and
> dropping its scipy_ prefix). Since scipy_base would be mostly pure Python,
> it should be feasible.

Good idea.  The current "packagization" of scipy_test was a complete hack to get
around limitations in distutils.  scipy_base is a much better home for it.

> (Later, be not surprised if I will question the naming of handy.py and
> misc.py, but I am not ready for that yet ...;-)

Funny you should mention that.  misc.py was my utility module.  handy.py was
Travis O.'s.  We both thought they should be merged into an appropriately named
module in the move to scipy_base.  Pick a name.

> > In regards to higher level modules that use fft, svd, and other complex
> > algorithms, they are just gonna have to import scipy.
>
> +2
>
> > This requires some discussion before we make the change.  It's also gonna
> > require someone to step up and implement the change -- though it probably
> > isn't a major effort.
>
> It may be a good idea to release 0.2 before such a change. If it works out
> nicely, then 0.3 could follow quickly.

We could do that.  I think the change isn't that difficult.  Travis O. has
already structured the code in a way that is pretty much equivalent to the
scipy_base idea.  His level0 functions/modules can be moved over into
scipy_base plus fastumath, limits, scipy_test (others?).  Creating scipy_base
now solves the problem of where to put fastumath which doesn't have a good home.
The issue that needs more thought is the NaN functions.  They should also go
over there, but they are part of cephes, and the entire "special" package should
not be moved (I don't think...).  Needs the most thought.  After making the
scipy_base package, the find/replaces need to be done in appropriate modules.

I'd lean toward trying to get the scipy_base idea in this release.  If it looks
like too much disruption though, we'll push it to 0.3.

Perhaps April 5th is too ambitious to fit all this in.  I'd like to try though.

eric





