[Numpy-discussion] Comments on the Numarray/Numeric discussion
perry at stsci.edu
Thu Jan 22 19:08:59 CST 2004
Travis Oliphant writes:
> The two major problems I see with Numarray replacing Numeric are
> 1) How is UFunc support? Can you create ufuncs in C easily (with a
> single function call or something similar).
Different, but I don't think it is difficult to add ufuncs (and
probably easier if many types must be supported, though I doubt
that is much of an issue for most mathematical functions which
generally are only needed for the float types and perhaps complex).
> 2) Speed for small arrays (array creation is the big one).
This is the much harder issue. I do wonder if it is possible to
make numarray any faster than Numeric on this point (or, as others
later mention, whether the complexity that it introduces is worth it).
> It is actually quite a common thing to have a loop during which many
> small arrays get created and destroyed. Yes, you can usually make such
> code faster by "vectorizing" (if you can figure out how). But the
> average scientist just wants to (and should be able to) just write a loop.
I'll pick a small bone here. Well, yes, and I could say that a
scientist should be able to write loops that iterate over all
array elements and expect that they run as fast. But they can't.
After all, using an array language within an interpreted language
implies that users must cast their problems into array manipulations
for it to work efficiently. By using Numeric or numarray they *must*
buy into vectorizing at some level.
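To make the contrast concrete, here is a minimal sketch of the same arithmetic written as a per-element loop and as a whole-array expression. It assumes NumPy (the later successor to both packages) as a stand-in for Numeric or numarray, and the function names are hypothetical, chosen only for illustration.

```python
import numpy as np

def scale_and_shift_loop(values, gain, offset):
    """Per-element Python loop: the interpreter handles every element."""
    out = np.empty(len(values), dtype=float)
    for i in range(len(values)):
        out[i] = gain * values[i] + offset
    return out

def scale_and_shift_array(values, gain, offset):
    """Same arithmetic as one array expression; the loop runs in compiled code."""
    return gain * np.asarray(values, dtype=float) + offset

data = np.arange(5, dtype=float)
loop_result = scale_and_shift_loop(data, 2.0, 1.0)
array_result = scale_and_shift_array(data, 2.0, 1.0)
```

Both functions return the same values; only the second hands the iteration to the array package, which is the sense in which users "buy into vectorizing".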
Having said that, it certainly is true that there are problems
with small arrays that cannot be easily vectorized by combining
them into higher-dimensional arrays. I think the two most common
cases are variable-sized small arrays and iterative algorithms on
small arrays that must be iterated many times (some of these
problems can be cast into larger vectors, but often not easily).
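For the case that does combine cleanly, equal-sized small arrays, a sketch of the idea might look like the following. NumPy is again assumed as a stand-in for Numeric or numarray, and the helper names are hypothetical. The stacked version depends on every vector having the same length, which is exactly why variable-sized small arrays do not fit this pattern.

```python
import numpy as np

def norms_one_at_a_time(vectors):
    """Create and discard one small array per iteration."""
    return [float(np.sqrt(np.sum(np.asarray(v, dtype=float) ** 2)))
            for v in vectors]

def norms_stacked(vectors):
    """Stack equal-length vectors into one (n, k) array, reduce along axis 1."""
    stacked = np.asarray(vectors, dtype=float)  # requires equal lengths
    return np.sqrt(np.sum(stacked ** 2, axis=1))

small_vectors = [[3.0, 4.0], [6.0, 8.0], [0.0, 1.0]]
looped = norms_one_at_a_time(small_vectors)
batched = norms_stacked(small_vectors)
```

The first function pays the array-creation cost once per vector; the second pays it once for the whole batch.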
> Regarding speed issues. Actually, there are situations where I am very
> unsatisfied with Numeric's speed performance and so the goal for
> Numarray should not be to achieve some percentage of Numeric's
> performance but to beat it.
> Frankly, I don't see how you can get speed that I'm talking about by
> carrying around a lot of extras like byte-swapping support,
> memory-mapping support, record-array support.
You may be right. But then I would argue that if one wants to speed
up small array performance, one should really go for big improvements.
To do that suggests taking a significantly different approach from
either Numeric or numarray. But that's a different topic ;-)
To me, factors of a few are not necessarily worth the trouble
(and I wonder how much of the phase space of problems they really
help move into feasibility). Yes, if you've written a bunch of programs
that use small arrays that are marginally fast enough, then a factor
of two slower is painful. But there are many other small array problems
that were too slow already that couldn't be done anyway. The ones
that weren't marginal will likely still be acceptable.
Those that live in the grey zone now are the ones that are most sensitive
to the issue. All the rest don't care. I don't have a good feel
for how many live in the grey zone. I know some do.