[Numpy-discussion] Introduction
Scott Gilbert
xscottg at yahoo.com
Sun Apr 14 04:20:03 CDT 2002
Perry, I've been trying to be persuasive, but I think all I've
managed to do is to be verbose and annoy you. Please accept
my apologies.
I really am sorry this is going as poorly as it is. I'm doing a lousy
job of getting my point across, and I'd like to turn around the tone
this has taken. Email always comes off as more antagonistic
than intended.
Finally, my appeal to the fact that you are proposing a standard
was heavy-handed. I guess I was trying to use that to force
you to consider my position. It clearly backfired...
I'll try to be more to the point.
Here's what I'm proposing, and it's only a suggestion.
*** I think the requirements for being a general purpose "NDArray"
can be specified with only the following attributes:
__array_buffer__ - as buffer object
__array_shape__ - as tuple of long
__array_itemsize__ - as int
Optionally
__array_stride__ - as tuple of long (get from shape if None)
__array_offset__ - as int (would default to 0 if not present)
Then anyone who implemented these could work with the same C API for
getting the pointer to memory, shape array, stride array, and item size.
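A rough Python sketch of that shared access path (not the C API itself) might look like the following. The names RawArray, default_stride, and array_info are made up for illustration, and the row-major (C-order) default for a missing stride is an assumption, since the proposal only says the stride is derived from the shape.

```python
class RawArray:
    """Hypothetical minimal object exposing only the required attributes."""

    def __init__(self, buffer, shape, itemsize):
        self.__array_buffer__ = buffer
        self.__array_shape__ = shape
        self.__array_itemsize__ = itemsize


def default_stride(shape, itemsize):
    """Byte strides for a row-major layout (the ordering is an assumption)."""
    stride = []
    step = itemsize
    for extent in reversed(shape):
        stride.append(step)
        step *= extent
    return tuple(reversed(stride))


def array_info(obj):
    """Return (buffer view, shape, stride, itemsize, offset) for a conforming object."""
    buf = memoryview(obj.__array_buffer__)
    shape = tuple(obj.__array_shape__)
    itemsize = int(obj.__array_itemsize__)
    # Optional attributes fall back to the defaults described above.
    stride = getattr(obj, "__array_stride__", None)
    if stride is None:
        stride = default_stride(shape, itemsize)
    offset = getattr(obj, "__array_offset__", 0)
    return buf, shape, tuple(stride), itemsize, offset


# A 2x3 array of 4-byte items backed by a plain bytearray.
info = array_info(RawArray(bytearray(24), (2, 3), 4))
```

Any object carrying the three required attributes goes through the same function, whatever class it happens to be.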
The set of operations on a pure "NDArray" is probably pretty minimal
(reshape, transpose/rotate, index arrays?).
So in order to create a full featured "NumArray", a few more attributes
are required:
__array_itemtype__ - as string?
Optionally
__array_endian__ - as 1 char string? (default to the native endian)
This brings the total up to 4 required attributes, and 3 optional ones
for a very general purpose array data structure. (I can think of other
optional ones, but skip that for now.)
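To illustrate how the item type and endian attributes could be used together, here is a small sketch. It assumes, purely for the example, that the item type string is a format character from Python's struct module and that the endian string is "<" or ">"; the proposal does not pin down either encoding. The helper name decode_items is hypothetical.

```python
import struct
import sys

# Default to the byte order of the machine running the code.
NATIVE_ENDIAN = "<" if sys.byteorder == "little" else ">"


def decode_items(buffer, itemtype, endian=None):
    """Decode every item in the buffer, using the native byte order if none is given."""
    fmt = (endian or NATIVE_ENDIAN) + itemtype
    return [fields[0] for fields in struct.iter_unpack(fmt, buffer)]


# Three 16-bit integers stored big-endian.
data = struct.pack(">3h", 1, 2, 3)
as_big = decode_items(data, "h", ">")
as_little = decode_items(data, "h", "<")
```

Reading the same bytes with the wrong byte order gives different numbers, which is why the endian attribute has to travel with the buffer.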
>
> All in all you are talking about checking quite a few attributes
> to make sure the object has the interface. And even if it does,
> *why* in the world would we presume that the C functions used by
> numarray would work properly with the object you provide.
>
Because truthfully arrays are little more than a pointer to memory.
That's like asking "why in the world would we presume memcpy() or
qsort() would know what to do with your memory?"
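To make the "pointer to memory" point concrete, here is a minimal sketch of locating one element from nothing but a buffer, a stride tuple, and an offset. The helper name element_offset and the sample data are illustrative only.

```python
import struct


def element_offset(index, stride, offset=0):
    """Byte position of the element at `index`, given byte strides."""
    return offset + sum(i * s for i, s in zip(index, stride))


# A 2x3 array of little-endian 32-bit ints stored row by row.
buf = struct.pack("<6i", 10, 11, 12, 20, 21, 22)
stride = (12, 4)

pos = element_offset((1, 2), stride)
value = struct.unpack_from("<i", buf, pos)[0]

# A transposed view needs no copy: swap the strides and index as (column, row).
transposed_value = struct.unpack_from("<i", buf, element_offset((2, 1), (4, 12)))[0]
```

Nothing here depends on what kind of object owns the bytes.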
>
> You haven't provided any example (let
> alone a compelling one) of why we should accept any object that
> provides those attributes.
>
Well, the UFuncs certainly should reject any object that they don't
know how to handle. I'm currently only addressing what it takes to be
an NDArray/NumArray object. OTOH, if I can present something to the
UFuncs that looks like a known array type, why wouldn't UFuncs
want to work with it?
Ok, so what does this buy you?
Well, it probably doesn't buy you personally very much. Your needs are
already being met by the current implementation.
Ok, so what does this cost you?
A few translations:
_data -> __array_buffer__
_shape -> __array_shape__
_strides -> __array_stride__
_itemsize -> __array_itemsize__
_offset -> __array_offset__
_type -> __array_itemtype__
_byteswap -> __array_endian__
This isn't a style criticism. I'm not just asking you to change your
names; I'm asking you to promote the names to a "standard interface",
much like these things are in many places in Python.
It also requires some small changes to getNDInfo() and getNumInfo()
so that they can calculate the derived fields (contiguous, aligned,
etc...).
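As a sketch of how such derived fields could be computed from the basic attributes alone, consider the helpers below. They are hypothetical Python stand-ins, not the actual getNDInfo() or getNumInfo() code. The contiguity test assumes a row-major layout, and the alignment test assumes the start of the buffer is itself suitably aligned, which pure Python cannot verify.

```python
def is_contiguous(shape, stride, itemsize):
    """True if the items are packed in row-major order with no gaps."""
    expected = itemsize
    for extent, step in zip(reversed(shape), reversed(stride)):
        # A dimension of length 1 is never stepped over, so its stride is irrelevant.
        if extent != 1 and step != expected:
            return False
        expected *= extent
    return True


def is_aligned(offset, stride, itemsize):
    """True if every item starts on a multiple of the item size."""
    return offset % itemsize == 0 and all(s % itemsize == 0 for s in stride)


plain = is_contiguous((2, 3), (12, 4), 4)        # packed 2x3 array
transposed = is_contiguous((3, 2), (4, 12), 4)   # same memory, swapped strides
```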
It also requires some changes to your scripts so that they check for
the interface rather than the inheritance.
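Checking for the interface could be as simple as the sketch below. The function name is_ndarray_like and the example class are hypothetical; only the three attribute names come from the proposal.

```python
REQUIRED_ATTRIBUTES = (
    "__array_buffer__",
    "__array_shape__",
    "__array_itemsize__",
)


def is_ndarray_like(obj):
    """Accept any object that carries the required attributes, whatever its class."""
    return all(hasattr(obj, name) for name in REQUIRED_ATTRIBUTES)


class ThirdPartyArray:
    """Shares no base class with anything, but exposes the attributes."""

    __array_buffer__ = bytearray(8)
    __array_shape__ = (2,)
    __array_itemsize__ = 4


accepted = is_ndarray_like(ThirdPartyArray())
rejected = is_ndarray_like([1, 2, 3])
```

An isinstance() test would turn away ThirdPartyArray even though it supplies everything needed to reach its memory.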
What are the benefits to anyone else?
- Describes how anyone could implement something that looks and acts
like NDArrays or NumArrays. There are probably a lot of reasons to
want to do this. I have some reasons that I don't think you value
too much. I think others would have reasons which I can't imagine too.
- Allows one standard API for getting at the basics of NDArrays/NumArrays
- Allows anyone to easily implement other data types for NumArrays.
The typecode won't match any of your builtin types, but maybe other
third parties could agree on other typecodes for their crazy needs and
share modules.
- Allows me personally to distribute a separate (and simpler)
implementation of NDArrays/NumArrays right now and have the same data
objects work with yours when you're all done. If I give the UFuncs a
pointer to memory, and the attributes above, why shouldn't it work
correctly?
>
> We're not going to budge until you show us what the hell you are talking
> about.
>
Am I doing any better? I am trying.
>
> You are right on complex ints (that we won't consider them). One
> could take numarray and add them if one wanted and have a more
> extended version. But we won't do it, and we wouldn't support as
> being in what we maintain. It's one of those trade offs.
>
Is there a way, today, without modifying numarray, for me to use
numarray as a holder for these esoteric data types? Is that way difficult?
Could it be easier?
I'm not asking numarray to know about my types in its core baseline. I'm
wondering what it takes to implement new types at all.
>
> Your example shows nothing about what your
> real needs for the object are.
>
My real needs are all over the place. Some of them, as you've shown me,
are solvable with the current implementation of numarray. Others you
haven't addressed or have said you won't address.
To be explicit:
Here are (at least most of) my _needs_ for array objects:
- support a wide variety of data types (user defined)
- have efficient storage
- support the pickle interface for serialization
- allow alternate sources of underlying memory
- have an easy interface for accessing the pieces
necessary to create C extensions (buffer, shape, stride, ...)
- completed and reliable in the near term
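As an illustration of the "alternate sources of underlying memory" item, the sketch below uses modern Python's memoryview to treat three unrelated objects as the same kind of byte buffer. The helper name as_byte_view is hypothetical, and an anonymous mmap stands in for a file-backed one.

```python
import array
import mmap


def as_byte_view(source):
    """View any buffer-providing object as a flat sequence of bytes, without copying."""
    return memoryview(source).cast("B")


sources = {
    "bytearray": bytearray(24),
    "array": array.array("B", bytes(24)),
    "mmap": mmap.mmap(-1, 24),  # anonymous mapping, 24 bytes
}

views = {name: as_byte_view(src) for name, src in sources.items()}
sizes = {name: view.nbytes for name, view in views.items()}

# Writing through the view changes the underlying memory.
views["bytearray"][0] = 255
first_byte = sources["bytearray"][0]
```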
Here are (at least some of) my _wants_ for array objects:
- cooperate on some level with other standard array
modules (once the standard is set)
- have same API for accessing the pieces (buffer, shape,
stride, ...) as all standard array modules will.
- implementation in pure Python so that building extension
modules is not required until the fast operations present
in those modules are required.
- implemented from a standard that is as good as it can be
Here are (at least some of) my _whims_ for array objects:
- has "windowing" functionality to work efficiently with
really large files (on any modern platform).
- alternate implementations for things such as "slicing
behaviour" (copy on write, reference).
Loosely following your design, I've already written a module that meets
my "needs". I was hoping that we could cooperate towards filling in some
of my "wants" (cooperating array modules), and I've brought up my "whims"
because I thought they were interesting possibilities for discussion.
I was going to respond to some of your other remarks, but I've probably
wasted enough of your time. If you don't respond to this message, I'll
take that as a sign that we just aren't going to see eye to eye on any of
this, and I won't bother you any more.
(I'll be half surprised if you even get this message. From the tone
of your last one, I wouldn't be shocked to find out you've already
added me to your killfile. :-)
No hard feelings,
-Scott Gilbert