[Numpy-discussion] Speeding up numarray -- questions on its design

Fernando Perez Fernando.Perez at colorado.edu
Mon Jan 17 13:25:34 CST 2005

Hi all,

just some comments from the sidelines; I applaud the fact that we are 
moving towards a successful Numeric/numarray integration.

Perry Greenfield wrote:

> the array struct size matters. Do you have code with hundreds of 
> thousands
> of small arrays existing simultaneously?

I do have code with perhaps ~100k 'small' arrays (12x12x12 or so) in existence 
simultaneously, plus a few million created temporarily as part of the 
calculations.  Needless to say, this uses Numeric :)  What's been so nice 
about Numeric is that even with my innermost loops (carefully) coded in 
python, I get very acceptable performance for real-world problems.
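To make the struct-size concern concrete, here is a back-of-the-envelope sketch (the header sizes are purely illustrative assumptions, not the actual Numeric or numarray struct sizes) showing how a fixed per-array overhead scales once ~100k small arrays are alive at once:

```python
# Sketch with illustrative numbers -- NOT actual Numeric/numarray struct
# sizes.  With ~100k arrays alive simultaneously, the fixed per-array
# header cost scales linearly and can rival the payload itself.
n_arrays = 100_000
payload = 12 * 12 * 12 * 8            # one 12x12x12 float64 array: 13,824 bytes
for header in (40, 300, 2000):        # hypothetical per-array struct sizes
    total_mb = n_arrays * (payload + header) / 1e6
    print(f"header={header:5d} B -> total ~{total_mb:8.1f} MB")
```

For 12x12x12 arrays the payload still dominates, but the same arithmetic with, say, length-12 vectors flips quickly in favor of a lean header.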

Perry and I had this conversation over at scipy'04, so this is just a 
reminder.  The Blitz++ project has faced similar performance problems with 
their very flexible array classes, and their approach has been to have 
separate TinyVector/TinyMatrix classes.  These offer almost none of the 
fancier features of the default Blitz Arrays, but they keep the same syntactic 
behavior and identical semantics where applicable.  What they give up in 
flexibility, they gain in performance.
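A toy Python sketch of the TinyVector idea (the class name and feature set are mine, chosen for illustration): fixed contents, no byteswapping or memory-mapping machinery, a deliberately minimal per-object footprint, but the same elementwise arithmetic syntax one expects from a full array class:

```python
class TinyVector:
    """Toy sketch of a Blitz++-style TinyVector: a minimal fixed-length
    vector that trades features for a small per-object footprint while
    keeping elementwise arithmetic syntax."""
    __slots__ = ("_data",)            # no instance dict: keep objects lean

    def __init__(self, data):
        self._data = tuple(float(x) for x in data)

    def __add__(self, other):
        if isinstance(other, TinyVector):
            return TinyVector(a + b for a, b in zip(self._data, other._data))
        return TinyVector(a + other for a in self._data)   # scalar broadcast

    def __mul__(self, other):
        if isinstance(other, TinyVector):
            return TinyVector(a * b for a, b in zip(self._data, other._data))
        return TinyVector(a * other for a in self._data)   # scalar broadcast

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return f"TinyVector({list(self._data)})"

v = TinyVector([1.0, 2.0, 3.0])
w = v + v          # elementwise add -> TinyVector([2.0, 4.0, 6.0])
u = v * 2.0        # scalar broadcast -> TinyVector([2.0, 4.0, 6.0])
```

The real win in Blitz++ comes from the size being a compile-time template parameter, which a Python sketch cannot express; this only shows the interface compatibility side of the trade-off.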

I realize this requires a substantial amount of work, but perhaps it will be 
worthwhile in the long run.  It would be great to have a numarray 
small_array() object which would not allow byteswapping, memory-mapping, or 
any of the extra features which make them memory and time consuming, but which 
would maintain compatibility with the regular arrays as far as arithmetic 
operators and ufunc application (including obviously lapack/blas/f2py usage). 
I know I am talking from 50,000 feet up, so I'm sure once you get down to 
the details this will probably not be easy (I can already see difficulties 
with the size of the underlying C structures for C API compatibility).  But in 
the end, I think something like this might be the only way to satisfy all the 
disparate usage cases for numerical arrays in scientific computing.  Besides 
the disparity in advanced features, a simple set of guidelines for the 
crossover points in terms of performance would let users choose, in their own 
codes, which to use.
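One rough way such crossover guidelines could be derived (a sketch of the measurement approach only; the element counts and the plain-list stand-in for real array types are my assumptions) is to time an elementwise operation at several sizes and watch where fixed per-call overhead stops dominating:

```python
import timeit

# Hypothetical crossover measurement: time an elementwise add at a few
# sizes.  For tiny operands the fixed per-call overhead dominates; as n
# grows, the per-element work takes over, revealing the crossover region.
def bench(n, reps=1000):
    a = list(range(n))
    b = list(range(n))
    t = timeit.timeit(lambda: [x + y for x, y in zip(a, b)], number=reps)
    return t / reps                   # mean seconds per elementwise add

for n in (3, 12, 144, 1728):          # 1728 = one 12x12x12 block, flattened
    print(f"n={n:5d}: {bench(n) * 1e6:8.2f} us per add")
```

Published as a table per platform, numbers like these would let users pick the small-array or full-array type for their own codes without guessing.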

At any rate, I'm extremely happy to see scipy/numarray integration moving 
forward.  My thanks to all those who are actually doing the hard work.
