[Numpy-discussion] Numeric3

Travis Oliphant oliphant at ee.byu.edu
Sat Feb 5 16:00:09 CST 2005


Peter Verveer wrote:

>
> What I find unfortunate is that due to the differences between the 
> packages, nd_image cannot be compiled for both Numeric and Numarray at 
> the moment. I did not foresee that the split between the packages would 
> exist for so long. I do however not agree that it is an example of 
> duplication of work. The N-D convolutions are based on a 
> filter-framework that is an essential part of nd_image, which would 
> have to be implemented anyway (e.g. the morphology operators are also 
> based on it.) So one should not be too quick to judge if something is 
> duplication and a waste of resources.


This is exactly why I am dissatisfied with the way numarray has been 
advertised (I'm not blaming anyone here, I recognize the part that I 
played in it).  Third-party package providers started to build on top 
of numarray before it was a clear replacement.   I don't see anything in 
nd_image that could not have sat on top of Numeric from the 
beginning.    I don't see why it needed to use numarray-specific things 
at all.   By the way, nd_image is a very nice piece of work that I 
have been admiring.  

> common standard is a very good idea. But right now I don't find SciPy 
> attractive as a framework, because 1) it is too big and not easily 
> installed. 2) it is not very well documented. Thus, I prefer to write 
> my package against a smaller code-base, in this case Numarray, but it 
> could have also been Numeric. That has the advantage that people can 
> install it more easily, while it still can be included in things like 
> SciPy if desired.

What is too big about it?  Which packages would you like to see not 
included?   Is it the dependence on Fortran that frightens, or the 
dependence on ATLAS (not a requirement by the way)?   How many realize 
that you can turn off installation of packages in the setup.py file by a 
simple switch?

The point is that nd_image should live perfectly well as a scipy package 
and be installed and distributed separately if that is desired as 
well.   But, it would present a uniform face to others looking to do 
scientific work with Python.  This appearance of multiple packages has 
only amplified the problem that scipy was trying to solve.  If scipy is 
solving it badly, then let's fix it.     I do appreciate the comments 
that are helping me understand at least what needs to be fixed.  But, 
everybody has always been welcome to come and help fix things --- it is 
quite easy to get CVS write access to the scipy core.

Matplotlib as well is reinventing stuff already available elsewhere.  
SciPy has always tried to bring a matlab-like environment to the user.  
That's always been my goal.  It is frustrating to see so many with the 
same goal work so independently of each other.    Sometimes I guess it 
is necessary so that new ideas can be tried out.   But, it sure seems to 
happen more often than I would like with Python and Science / Engineering.

Is it ownership that is a concern?  None of us involved with scipy have 
an ulterior motive for trying to bring packages together except the 
creation of a single standard and infrastructure for scientific 
computing with Python.    I keep hearing that people like the idea of a 
small subset of packages, but that is exactly what scipy has always been 
trying to be.   Perhaps those of us involved with scipy have not voiced our 
intentions often enough or loudly enough. 

> I can only agree with that. Regardless if I want to use SciPy or not, 
> or in whatever form, I would like to see this problem go away, so that 
> my software can become available for everybody.

I'm glad to hear this. 

>> My very ambitious goal with Numeric3 is to replace both Numeric and 
>> Numarray (heavily borrowing code from each).  When I say replace 
>> them, I'm talking about the array object and umath module.  I'm not 
>> talking about the other packages that use them.
>
>
> I have to admit that I was very sceptical when I read your 
> announcement for Numeric3, since I thought it would not be a good idea 
> to have yet another array package. 

I've never seen it as yet another array package.  I've always seen it as 
a Numeric upgrade with the hope of getting closer to numarray (and even 
replacing it if possible).    In fact, since I'm currently the maintainer 
of Numeric, those who use Numeric should perhaps be the most concerned 
because Numeric3 will replace Numeric (and will therefore try not to 
break old code too much --- though there may be a few changes).   I can 
guarantee that SciPy will work with Numeric3.   We've always told scipy 
users that if they use SciPy they won't have to worry about the 
arrayobject issue because scipy will shield them from it (at least on 
the Python level).

Of course Numeric will still be there and if someone else wants to 
maintain it, they are welcome to.  I'm just serving notice that any 
resources I have will be devoted to Numeric3, and as soon as Numeric3 is 
out of alpha I would actually like to call it Numeric (version 30.0).

> But it seems to me that there is a danger for the "yet another 
> package" effect to occur. I think I will remain sceptical unless you 
> achieve three things: 1) It has the most important improvements that 
> numarray  has. 2) It has a good API that can be used to write packages 
> that work with Numeric3/SciPy and Numarray (the latter probably will 
> not go away). 3) Inclusion in the main Python tree, so that it is 
> guaranteed to be available.

Thanks for your advice.  All encouragement is appreciated. 

1)  The design document is posted.  Please let me know what "the most 
important improvements" are.

2)  It will have an improved API and I would also like to support a lot 
of the numarray API as well (although I don't understand the need for 
many of the API calls numarray allows and see much culling that needs to 
be done --- much input is appreciated here as to which are the most 
important API calls.  I will probably use nd_image as an example of what 
to support).

3) Inclusion in the main Python tree will rely on Guido feeling like 
there is a consensus.  I'm hopeful here and will try to pursue this -- 
but I need to finish code first.

> Jochem Küpper just outlined very well what it could look like: A small 
> core, plus a common project with packages at different levels. I think 
> it is a very good idea, and probably similar to what SciPy is trying 
> to do now. But he suggests an explicit division between independent 
> packages: basic packages, packages with external library dependencies 
> like FFTW, and advanced packages. Maybe something like that should be 
> set up if we get an arrayobject into the Python core.

Sounds great.  SciPy has been trying to do exactly this, but we need 
ideas --- especially from package developers who understand the issues 
--- as to how to set it up correctly.  We've already re-factored a 
couple of times.  We could do it again if we needed to, so that the 
infrastructure had the right feel.   A lot of this is already in place.  
I don't think many recognize some of the work that has already been 
done.  This is not an easy task. 

>
> BTW it was mentioned before that it would be a problem to remove 
> packages like LinearAlgebra and FFT from the core Numeric. matplotlib 
> was mentioned as an example of a package that depends on them. I think 
> that points however to a fundamental problem with matplotlib: why does 
> a plotting library need FFTs and linear algebra? So I don't think 
> anybody can really argue that things like an FFT should be in a core 
> array package.

My point exactly.  Having these things in Numeric/ Numarray actually 
encourages the creation of multiple separate packages as everybody tries 
to add just their little extra bit on top.

Plotting is potentially problematic because there are a lot of ways to 
plot.  I think we need to define interfaces in this regard and adapters 
so that commands that would throw up a plot could use several different 
plotting methods to do it.   I'm not favoring any plotting technique, so 
don't pre-guess me.  My ideas of plotting are probably very similar to 
John's with matplotlib.  His work is another that I'm disappointed is 
not part of scipy, and that disappointment has led me to my current 
craziness with Numeric3 :-)

>
> Agreed about the single standard thing. But I am not willing to just 
> 'join' the SciPy project to achieve it (at least for nd_image). I am 
> however very interested in the possibility of writing against a small 
> high-quality array package that is included in the Python core. That 
> would be all the standard I need. If you manage to make SciPy into a 
> useful larger standard on top of that, great, more power to all of us!

Why not?  Your goals are not at odds with ours.   O.K. it may be that 
more work is required to re-write nd_image against the Numeric C-API 
than you'd like --- that's why Numeric3 is going to try and support the 
numarray C-API as well.

Thanks for your valuable feedback.   I appreciate the time it took you 
to provide it.   I hope we can work more together in the future.

Best regards,

-Travis





