[Numpy-discussion] Re: Bytes Object and Metadata
oliphant at ee.byu.edu
Mon Mar 28 15:26:32 CST 2005
> Just to add my two cents, I don't think I ever thought it was
> necessary to bundle the metadata with the memory object for the
> reasons Scott outlined. It isn't needed functionally, and there are
> cases where the same memory may be used in different contexts (as is
> done with our record arrays).
I'm glad we've worked that one out.
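To illustrate the point about the same memory being used in different contexts (a sketch in modern stdlib terms, using memoryview, the eventual successor of the buffer object discussed here): the metadata — item type and length — lives in each view, not in the memory itself.

```python
# Eight raw bytes of memory; the interpretation is supplied by the view.
buf = bytearray(b"\x01\x00\x02\x00\x03\x00\x04\x00")

# The same memory, reinterpreted as four unsigned 16-bit items (no copy).
as_u16 = memoryview(buf).cast("H")

assert len(buf) == 8
assert len(as_u16) == 4
# On a little-endian machine this reads back as [1, 2, 3, 4]:
print(as_u16.tolist())
```

The point is that two descriptions of one buffer coexist without the buffer itself carrying either description.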
> Numarray, when it uses the buffer object, always gets a fresh pointer
> for the buffer object for every data access. But Scott is right that
> that pointer is good so long as there isn't a chance for something
> else to change it. In practice, I don't think that ever happens with
> the buffers that numarray happens to use, but it's still a flaw of the
> current buffer object that there is no way to ensure it won't change.
One could see it as a "flaw" in the buffer object, but I prefer to see
it as a problem with objects that use the PyBufferProcs protocol. It
is, at worst, a "limitation" of the buffer interface that should be
advertised (in my mind the problem lies with the objects that make use
of the buffer protocol and also reallocate memory willy-nilly, since
Python does not allow for this). To me, an analogous situation occurs
when an extension module writes into memory it does not own and causes a
seg-fault. A casual observer might call that a Python flaw, but clearly
the problem is with the extension object.
It certainly does not mean that something like a buffer object
should never exist or that the buffer protocol should not be used. I
sometimes get the feeling that some people on python-dev who are new
to Numeric and numarray feel that way.
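As a note for later readers: Python eventually addressed exactly this limitation by letting a buffer exporter refuse to reallocate while views of its memory are outstanding. A sketch with the modern memoryview (Python 2.6+):

```python
ba = bytearray(b"data")
mv = memoryview(ba)          # exports ba's internal buffer

try:
    ba.extend(b"!!!")        # would reallocate the exported memory
except BufferError:
    print("resize refused while a view is live")

mv.release()                 # drop the export...
ba.extend(b"!!!")            # ...and now the resize succeeds
```

In other words, the "willy-nilly reallocation" problem was later solved on the exporter's side, just as argued above.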
> I'm not sure how the support for large data sets should be handled. I
> generally think that it will be very awkward to handle these until
> Python does as well. Speaking of which...
> I had been in occasional contact with Martin von Loewis about his work
> to update Python to handle 64-bit addressing. We weren't planning to
> handle this in numarray (nor Numeric3, right Travis or do I have that
> wrong?) until Python did. A few months ago Martin said he was mostly
> done. I had a chance to talk to him at Pycon about where that work
> stood. Unfortunately, it is not turning out to be as easy as he hoped.
> This is too bad. I have a feeling that this work is going to stall
> without help on our (numpy community) part to help make the changes or
> drum beating to make it a higher priority. At the moment the Numeric3
> effort should be the most important focus, but I think that after
> that, this should become a high priority.
I would be interested to hear what the problems are. Why can't you
just change the protocol, replacing all ints with Py_intptr_t? Is
backward compatibility the problem? That seems to be an issue only at
the extension-code level (and then only on 64-bit systems), so it would
be easier to force the change through in Python 2.5.
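To make the int-versus-Py_intptr_t point concrete (a sketch of the arithmetic, not the actual patch): a protocol that passes lengths and offsets as C int caps any buffer at 2**31 - 1 bytes, even on a 64-bit platform, whereas a pointer-sized integer does not.

```python
import ctypes

int_bits = 8 * ctypes.sizeof(ctypes.c_int)     # usually 32, even on 64-bit
ptr_bits = 8 * ctypes.sizeof(ctypes.c_void_p)  # 64 on a 64-bit platform

# Largest buffer length an int-based protocol can describe: just under 2 GiB.
max_int_len = 2 ** (int_bits - 1) - 1
print(max_int_len)  # 2147483647 on typical LP64/LLP64 systems
```

This is why the limit bites only on 64-bit systems: on 32-bit platforms a pointer-sized integer and an int describe the same range anyway.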
Numeric3 will suffer limitations whenever the sequence protocol is
used. We can work around them as much as possible (by avoiding the
sequence protocol wherever we can), but the limitation lies firmly in
the Python sequence protocol.
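For later readers: this is the limitation that PEP 353 eventually removed by switching the sequence and buffer protocols from int to Py_ssize_t in Python 2.5 (work led by the same Martin von Loewis mentioned above). A sketch of the resulting behavior on a 64-bit build, where a sequence length no longer has to fit in a 32-bit C int:

```python
import sys

class HugeSeq:
    """A lazy sequence whose length exceeds what a 32-bit C int can hold."""
    def __len__(self):
        return 2**32            # > 2**31 - 1; fits in a 64-bit Py_ssize_t
    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return i

assert sys.maxsize >= 2**32     # assumes a 64-bit build
seq = HugeSeq()
assert len(seq) == 2**32
assert seq[2**31 + 7] == 2**31 + 7
```

At the time of this email, len() on such an object would have been impossible, since the C-level length slot was a plain int.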