[Numpy-discussion] Counting array elements

Todd Miller jmiller at stsci.edu
Fri Nov 5 03:56:13 CST 2004


On Thu, 2004-11-04 at 17:50, Chris Barker wrote:
> Todd Miller wrote:
> >> What I was suggesting is that
> >>there should be an API for accessing the elements of an array that 
> >>doesn't rely on the standard strides approach. I guess I'm expressing my 
> >>disappointment that PyArrays don't follow one of the axioms of Object 
> >>Oriented Programming: Encapsulation. I should be able to get element 
> >>(i,j) of an array without knowing the data structures used to store the 
> >>data. 
> > 
> > (I think) numarray has what you're talking about:  the "element-wise
> > API".  It's documented in the manual but AFAIK is fairly slow and
> > probably not widely used.
> > 
> 
> Well, the "fairly slow" is the issue. Along with the not widely used.
> >>If we had that, then there could be a 1-d "flat" array that 
> >>supported discontiguous arrays in a different way than the strides 
> >>approach, while sharing the same data block as the parent N-d array.
> > 
> > 
> > The numarray "element-wise" API makes use of strides internally in order
> > to access array elements;  it does, however, hide what it's doing.
> 
> I'm no C wiz, but by being macros, it looks to me like they very much 
> depend on the PyArrayObject that is passed in storing its data with 
> strides, etc. anyway, so they couldn't be used with an object with a 
> different storage scheme.
> 
> > I don't understand the approach you're suggesting here though.  Can you
> > elaborate?
> 
> What I'm getting at is classic OO polymorphism: An Array class that has 
> a GetElement1d(i) method that returns the element. This class could then 
> be replaced with another class that uses a completely different internal 
> storage mechanism, but still has a GetElement1d(i) method. I know we're 
> working with C, rather than C++ here, but I think this kind of thing 
> could be faked with enough typecasting. On the other hand, I don't know 
> what the heck I'm talking about. I'm no C wiz.

We've already got this; the C equivalent of what you're talking about
is:

Float64 NA_get1_Float64(PyArrayObject *a, int i)

> Your comment about performance above is key, however. If this approach 
> has worse performance than doing pointer arithmetic by hand with 
> Array->strides et al, then it wouldn't get used universally, and we'd be 
> back where we started.

This is definitely the case.

>  I know even less C++ than C, but I think perhaps 
> the only way to get this with adequate performance would be to do a lot 
> of C++ template magic, like Blitz++.

C++ has function in-lining and templates so I think it is possible to
get cleaner, higher performance solutions to this problem.  
Unfortunately,  we purposely avoid C++ for portability reasons.

> In the early days of numarray development, there was discussion about 
> using Blitz++ (or other nifty C++ template based arrays) as the basis 
> for numarray. I think it all really boiled down to the template magic 
> required was not well supported by enough compilers, so it couldn't be 
> used. I think that's a shame, as while I haven't used C++ much, 
> templates and iterators and all look very appealing, and much better than 
> all the hassles of pointer arithmetic and static typing of C.
> 
> >>Anyway, I'm just dreaming, I suppose, we're pretty committed to the 
> >>current approach!
> > 
> > Good ideas have a way of getting adopted, so dream on...
> > 
> 
> Well, yes, but the core API of numarray is pretty well established by now.
> 
> >>Very cool! I'm still using Numeric, but I think next time I need to 
> >>write my own Ufunc extension, this may be what converts me!
> 
> By the way, the two reasons I still use numeric, other than inertia, are:
> 
> 1) slower small array performance: I use arrays a lot for the 
> convenience, rather than just when I have large arrays and need the 
> performance.

This is a very difficult nut to crack and not top priority for
numarray.   It always seems to degenerate into "move everything into C"
which is 180 degrees opposed to the original design/implementation
philosophy of numarray.

> 2) Much slower performance when passing arrays into wxPython, due to 
> wxPython using the generic sequence interface, which is apparently much 
> slower with numarray than Numeric. Has this changed?

No.  I haven't gotten around to fixing this yet.  We are getting better
at providing optional numarray/Numeric support for 3rd party packages, 
so at some point this should fall out even if numarray's sequence
protocol is not sped up;  much faster access is possible directly
through the PyArrayObject and can be done optionally.  I'm not saying
that's our plan,  just that there are multiple options on the table and
eventually we'll hit a combination that works here.

Regards,
Todd