[Numpy-discussion] Re: Trying out Numeric3

David M. Cooke cookedm at physics.mcmaster.ca
Wed Mar 23 03:24:54 CST 2005


On Wed, Mar 23, 2005 at 12:11:44AM -0700, Travis Oliphant wrote:
> David M. Cooke wrote:
> >Travis Oliphant <oliphant at ee.byu.edu> writes:
> >>Michiel Jan Laurens de Hoon wrote:
> >>>Travis Oliphant wrote:
> >>>>Michiel Jan Laurens de Hoon wrote:
> >>>>>Another warning was that PyArrayObject's "dimensions" doesn't seem
> >>>>>to be an int array any more.
> >>>>Yes.   To allow for dimensions that are bigger than 32-bits,
> >>>>dimensions and strides are (intp *).  intp is a signed integer with
> >>>>sizeof(intp) == sizeof(void *).  On 32-bit systems, the warning
> >>>>will not cause problems.  We could worry about fixing it by
> >>>>typedefing intp to int (instead of the current long for 32-bit
> >>>>systems).
> >Why not use Py_intptr_t? It's defined by the Python C API already (in
> >pyport.h).
> Sounds good to me.  I wasn't aware of it (intp or intptr is shorter 
> though).

Some reasons not to use those two:
1) intp is too short for an API. The user might be using it already.
2) the C99 type for this is intptr_t. Py_intptr_t is defined to be
   the same thing.

But let's step back a moment: PyArrayObject is defined like this:

typedef struct PyArrayObject {
    PyObject_HEAD
    char *data;
    int nd;
    intp *dimensions;
    intp *strides;
    ...

Thinking about it, I would say that dimensions should have the type
size_t *. size_t is the unsigned integer type used to represent the
sizes of objects (it's the type of the result of sizeof()). Thus, a
size_t is guaranteed to be large enough to hold any number that we
could use as an array dimension.

Also, since the elements of strides are byte offsets into the array,
strides should be of type ptrdiff_t *. The elements are used by adding
them to a pointer.
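
As a rough sketch of what "adding them to a pointer" looks like in
practice, here is one way to fetch an element of a 2-D array of doubles
from byte strides. The helper name get2d is hypothetical (it is not part
of Numeric or Numeric3), and it assumes the data is suitably aligned for
double:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Numeric: fetch element (i, j) of a
   2-D array of doubles given its base pointer and per-axis byte strides. */
static double get2d(const void *data, const ptrdiff_t *strides,
                    size_t i, size_t j)
{
    /* Byte arithmetic has to go through a char pointer. */
    const char *p = (const char *)data;

    assert(data != NULL && strides != NULL);

    /* Strides are signed byte offsets, so the sum may be negative. */
    p += (ptrdiff_t)i * strides[0] + (ptrdiff_t)j * strides[1];
    return *(const double *)p;
}
```

Because ptrdiff_t is signed, the same code would also cope with a
negative stride, for instance when walking the rows in reverse order.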

Is there a good reason why data is not of type void *? If it's char *,
it's quite easy to make the mistake of using data[0], which is probably
*not* what you want. With void *, you would have to cast it, as you
should be doing anyways, or else the compiler complains. Also, assigning
to a pointer of the right type, like double *A = array->data, doesn't
need a cast, as it does when data is a char *. In Numeric, char * is
probably a holdover from when Numeric had to compile with K&R-style C.
But we know we have ANSI C89 ('cause that's what Python requires).
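
A small illustration of the void * point, using a made-up struct
(demo_array and demo_sum are hypothetical names, not the real
PyArrayObject) and assuming the buffer really holds doubles:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the array struct; only data and a length. */
struct demo_array {
    void *data;
    size_t n;
};

static double demo_sum(const struct demo_array *a)
{
    const double *x;
    double total = 0.0;
    size_t i;

    assert(a != NULL);
    x = a->data;             /* void * converts implicitly: no cast needed */
    /* a->data[0] would not compile here, since void cannot be indexed. */
    for (i = 0; i < a->n; i++)
        total += x[i];
    return total;
}
```

With char *data instead, the assignment to x would need an explicit
cast, while a stray data[0] would compile and silently read one byte.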

So I figure it should look like this:

typedef struct PyArrayObject {
    PyObject_HEAD
    void *data;
    int nd;
    size_t *dimensions;
    ptrdiff_t *strides;
    ...

I've really started to appreciate size_t when trying to make programs
work correctly on my 64-bit machine :-) It's not just another pretty
face.

> >An array of longs would seem to be the best solution. On the two
> >64-bit platforms I have access to (an Athlon 64 and some Alphas),
> >sizeof(long) == 8, while my two 32-bit platforms (Intel x86 and
> >PowerPC) have sizeof(long) == 4.
> >
> I thought about this, but what about the MS Window compilers where long 
> is still 4 byte (even on a 64-bit system),  so that long long is the 
> size of  a pointer on that system.   I just think we should just create 
> an integer that will be big enough and start using it.

I don't know about ptrdiff_t, but sizeof(size_t) *should* be 8 on 64-bit
Windows.
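
One way to check this on a given compiler is simply to print the sizes.
A minimal sketch follows; format_sizes is a hypothetical helper, and the
caller is assumed to pass a buffer of at least 128 bytes:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the sizes under discussion into buf.
   Values are cast to unsigned long because C89 printf has no format
   specifier for size_t. Returns the number of characters written. */
static int format_sizes(char *buf)
{
    return sprintf(buf, "long=%lu size_t=%lu ptrdiff_t=%lu void*=%lu",
                   (unsigned long)sizeof(long),
                   (unsigned long)sizeof(size_t),
                   (unsigned long)sizeof(ptrdiff_t),
                   (unsigned long)sizeof(void *));
}
```

The output depends on the platform and compiler, so no expected values
are given here.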

> >For comparison, here's a list of sizes for various platforms
>...
> Nice table,  thanks...

There's another one (for all sorts of Linux systems) at
http://www.xml.com/ldd/chapter/book/ch10.html#t1

> >Also note
> >that size_t (which is the return type of sizeof()) is not int in
> >general (although lots of programs treat it like that).
> >
> >Using long for the dimensions also means that converting to and from
> >Python ints for indices is transparent, and won't fail, as Python ints
> >are C longs. This is the cause of several of the 64-bit bugs I fixed
> >in the latest Numeric release (23.8).
> > 
> >
> The conversion code has been updated so that it won't fail if the sizes 
> are actually the same for your platform.
> 
> >[I'd help with Numeric3, but not until it compiles with fewer than
> >several hundred warnings -- I *really* don't want to wade through all
> >that.]
> Do the warnings really worry you that much?   Most are insignificant.
> You could help implement a method or two pretty easily.  Or help with
> the ufunc module.

They really obscure significant warnings, though. And most look like
they can be dealt with. Right now, it doesn't compile for me.

I'll just list a few general cases:
- arrayobject.h redefines ushort, uint, ulong (they're defined in
  sys/types.h already for legacy reasons)
- functions taking no arguments should be defined like
      void function(void)
  not
      void function()
  (which is an old style that actually means the argument list isn't
  specified, not that it takes no arguments)
- then a bunch of errors with typos, and things not defined.
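
To make the void function(void) point concrete, here is a short sketch.
The names call_count and reset_count are invented for illustration and
do not come from arrayobject.h:

```c
/* Hypothetical counter used only for this illustration. */
static int call_count = 0;

/* Prototype form: the compiler knows reset_count takes no arguments
   and will reject a call such as reset_count(1). */
void reset_count(void);

void reset_count(void)
{
    call_count = 0;
}

/* Old-style declaration, shown for contrast only:
       void reset_count();
   In C89 and C99 this leaves the parameters unspecified, so a
   mismatched call is not diagnosed at compile time. */
```

Compilers typically warn about the old style only when asked (gcc has
-Wstrict-prototypes for this), which is one reason such declarations
tend to pile up unnoticed.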

I might get some time to track some down, but it's limited also :-)

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca



