[Numpy-discussion] Introduction

Perry Greenfield perry at stsci.edu
Mon Apr 15 14:20:01 CDT 2002


Hi Scott,

I'm not going to respond to all points but mainly concentrate on the
last section.
>
> Important Question:  If an NDArray had a typecode (and it was a known
> string), is it possible to promote it to one of the standard NumArray
> types?
>
I think we want to avoid NDArray having any type attribute (some types
have subtypes, and then the issue gets really messy). We leave it
to the subclass to address how types will be handled.

> Here goes (somewhat hypothetical, but close to the boat I'm currently in):
>
> Jon is our FPGA guy who makes screaming fast core files, but our FPGAs
> don't do floating point.  So I have to provide his driver with
> ComplexInt16 data.
>
> Jon and I write an extension module that calls his driver and reads data.
> We also write a C routine (call it "munge") that takes both ComplexInt16
> data, and ComplexFloat64 data.  We try it out for testing, and pass in my
> arrays in both places.  We could have used Numarray for the
> ComplexFloat64, but that meant we had to use two array packages, and use
> two C-APIs in our extension.  All we needed was a pointer to an array of
> doubles, so we stuck with mine.
>
> Ok, that part of development is done.  Now we present it to the
> application developers.  They're happy and we're rolling.  Successful
> application.
>
> Another group finds out about this and they want to use it.  They're using
> numarray for a large part of their application.  In fact, they're
> calculating the ComplexFloat64 half of the data that they want to pass to
> my "munge" routine using numarray, and they still need to use my
> ComplexInt32 data to read the FPGA.
>
> They're going to be disappointed to find out my extension can't read
> numarray data, and that they have to convert back and forth between the
> two.  And as the list of routines grows, they have to keep track of whether
> it is a numarray-routine, or a scottarray-routine.
>
> It's not so bad for one simple "munge" function, but there are going to be
> hundreds of functions...
>
> I don't expect you to have much sympathy for my having to convert data
> back and forth between my array types and yours, but it is an avoidable
> problem.
>
>
>
> For the most part, we both agree on what parts an NDArray should have.  If
> we could only agree what to name them, and that we'd stick to those names,
> that would be a large part of it for me.
>
>
I'm not sure I understand the problem in all the details I need to.
I'll restate it as best as I understand it and you can tell me if
I understood incorrectly.

You have extension modules that get complex int data from hardware.
Other processing may be done to the complex int data in that format
so it doesn't make sense to convert it to a more standard format when
reading it in. You have C extensions that carry out certain tasks
on complex data (in either complex int format or complex floats).
You have users that would like to use your routine with numarray.
(I haven't seen any specific mention of the need for ufuncs on
complex ints so I'll assume you just need complex int arrays as
containers for C programs to use.)

[If you did need to perform ufuncs on complex ints, then extending
numarray locally to handle them would be one possibility. That is a
little involved at the moment, and should be a little easier later
when we reimplement complex. Then again, maybe not: the complex stuff
is currently subclassed from numarray and, I think, not that hard to
adapt to ints, but it isn't that well done now.]

I guess my initial reaction is that you should develop a front-end
C-API that handles obtaining data buffers from different sources.
You get to define what kinds of things it supports, and it localizes
both changes to the list of types you support and any dependencies
on our or anyone else's API to a small section of code. From what
I'm hearing, you don't need it to provide much (pointers to arrays
and associated information). If we are real bozos and change the
interface, it doesn't hurt you much (not that we intend to be bozos
or change the C-API willy-nilly :-)

To elaborate, you define your equivalent of our getNumInfo routine.
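
The front-end layer described above would be written in C in practice.
The sketch below uses Python only to show its shape: one entry point
that accepts arrays from several sources and returns a single uniform
description. Every name in it (BufferInfo, ScottArray, get_buffer_info,
the type-name strings) is hypothetical, and the standard-library
array module stands in for a second array package. It is not
numarray's getNumInfo.

```python
from array import array
from dataclasses import dataclass


@dataclass
class BufferInfo:
    """Uniform description handed to the processing routines."""
    data: memoryview   # raw bytes of the array
    typename: str      # e.g. "ComplexInt16" (names are assumptions)
    shape: tuple
    byteswapped: bool


class ScottArray:
    """Hypothetical stand-in for the in-house array type."""
    def __init__(self, typename, shape, raw):
        self.typename, self.shape, self.raw = typename, shape, raw


def _from_scottarray(obj):
    return BufferInfo(memoryview(obj.raw), obj.typename, obj.shape, False)


def _from_stdlib_array(obj):
    # Assumed mapping from array typecodes to type names.
    names = {"h": "Int16", "d": "Float64"}
    return BufferInfo(memoryview(obj).cast("B"), names[obj.typecode],
                      (len(obj),), False)


# The only place that knows about each source's own interface.
_ADAPTERS = [(ScottArray, _from_scottarray), (array, _from_stdlib_array)]


def get_buffer_info(obj):
    """Return a BufferInfo for any supported array source."""
    for cls, adapter in _ADAPTERS:
        if isinstance(obj, cls):
            return adapter(obj)
    raise TypeError("unsupported array source: %s" % type(obj).__name__)
```

Supporting another array package then means adding one adapter and one
entry in _ADAPTERS; routines such as "munge" only ever see BufferInfo.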

I don't think I've seen anything that requires explicit dependencies
on Python attributes. Sure, you could use the same attribute names
and use Python calls to get those just as our getNumInfo routine does,
but I think that is bad practice. You may find some other representation
for arrays out there that doesn't fit this model and you may want to
work with those also and you won't be able to get them to adopt our
scheme.

You say that you don't want your users to have to convert between
the two data representations. If they are using your C extensions
that is understandable, and avoidable since you've written your
programs to deal with the various types. On the other hand,
unless you extend numarray, numarray clearly cannot deal with
the complex ints so conversion is necessary. But understandably,
you would like to eliminate the need for explicit conversions.
I think there is an easy way of dealing with this.

We haven't implemented this capability yet but we've been talking
about having numarray check input values to see if they have
a method "tonumarray" [not that we would choose that particular
method name, I'm just illustrating the point]. If that
method did exist, it would be called to create a numarray
from the object. Thus you could add such a method to your
class and when it is used in numarray ufuncs or in binary operations
with numarray objects, your complex ints are automatically
converted to numarray objects (presumably a complex float of
some precision). Adding this capability to numarray should be
pretty easy.
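
As a rough illustration of the conversion hook described above, here is
a minimal Python sketch. The method name tonumarray is only the
placeholder used in this message, not an implemented numarray feature.
ComplexIntArray, as_array and add are hypothetical, and plain lists of
Python complex numbers stand in for numarray arrays.

```python
class ComplexIntArray:
    """Hypothetical container holding complex ints as (real, imag) pairs."""

    def __init__(self, pairs):
        self.pairs = [(int(r), int(i)) for r, i in pairs]

    def tonumarray(self):
        # Placeholder hook name from the message; converts to complex floats.
        return [complex(r, i) for r, i in self.pairs]


def as_array(obj):
    """Coerce obj to the library's native array (a list of complex here)."""
    hook = getattr(obj, "tonumarray", None)
    if callable(hook):
        return hook()              # the object knows how to convert itself
    return [complex(x) for x in obj]


def add(a, b):
    """Toy stand-in for a binary ufunc: coerce both inputs, then add."""
    xs, ys = as_array(a), as_array(b)
    if len(xs) != len(ys):
        raise ValueError("length mismatch")
    return [x + y for x, y in zip(xs, ys)]
```

With this in place, add(ComplexIntArray([(1, 2)]), [0.5]) converts the
complex-int operand implicitly, so the caller never writes an explicit
conversion.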

True, the solution that I proposed doesn't protect you from ever
having to make changes. But we believe we are at a stage in the project
where it is dangerous to lock ourselves into lower level details
such as the internal description of the array. We still have things
to implement and that may cause us to realize that some changes
are needed. Our C-API stuff is relatively new. It may see changes
in the near future, but likely not many related to what you need.
And we intend to shield the C-API from changes in the Python
attributes. We could change the name or contents of _byteswap and
it would not change anything in the C-API. I see premature
coupling of low level implementation details as a bad thing,
not a good thing. Any changes that are made to the API require
changes only to the corresponding routine in your C-API, and all
your C applications are shielded from them (save rebuilding).
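
The shielding argument can be sketched in a few lines of Python. Only
the attribute name _byteswap comes from this message. The classes, the
renamed attribute _swapped_bytes and the helper functions are
hypothetical, chosen to show a rename being absorbed by a single
accessor.

```python
class ArrayV1:
    """Hypothetical array whose flag is stored as _byteswap."""
    def __init__(self, swapped):
        self._byteswap = swapped


class ArrayV2:
    """Hypothetical later version that renamed the attribute."""
    def __init__(self, swapped):
        self._swapped_bytes = swapped


def is_byteswapped(arr):
    """The single routine that knows how the flag is stored."""
    for name in ("_byteswap", "_swapped_bytes"):
        if hasattr(arr, name):
            return bool(getattr(arr, name))
    raise AttributeError("no byteswap flag found")


def describe(arr):
    # Application code goes through the accessor, never the attribute.
    return "byteswapped" if is_byteswapped(arr) else "native order"
```

When the attribute is renamed, only is_byteswapped changes; describe
and every other caller stay as they are.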

If I've misunderstood your examples, please let me know.

Perry




