[Numpy-discussion] Cython issues w/ 1.4.0
Dag Sverre Seljebotn
dagss@student.matnat.uio...
Tue Dec 8 14:52:16 CST 2009
Robert Kern wrote:
> On Tue, Dec 8, 2009 at 12:38, Pauli Virtanen <pav@iki.fi> wrote:
>> Tue, 2009-12-08 at 12:14 -0600, Robert Kern wrote:
>>> On Tue, Dec 8, 2009 at 12:08, Pauli Virtanen <pav@iki.fi> wrote:
>>>> Wed, 2009-12-09 at 02:47 +0900, David Cournapeau wrote:
>>>> [clip]
>>>>> Of course, this does not prevent us from applying your suggested change -
>>>>> I don't understand why you want to add it to 1.4.0, though. 1.4.0 does
>>>>> not break the ABI compared to 1.3.0. Or is it "just" to keep the
>>>>> Cython issue from reappearing in 1.5.0?
>>>> Yes, it's to avoid having to deal with the Cython issue again in 1.5.0.
>>> Do we have any features on deck that would add a struct member? I
>>> think it's pretty rare for us to do so, as it should be.
>> If we want to support PEP 3118 on Py2.6, then new fields would be
>> useful:
>>
>> - Python2.6 currently has issues with PyArg_ParseTuple("s#", ...):
>> defining a bf_releasebuffer breaks that particular feature.
>>
>> Consequently, if we want backwards compatibility, we cannot keep track
>> of allocated memory using the Py_buffer structure, so something else
>> is needed.
>>
>> We can probably get this fixed in future Python 2.6/2.7 releases,
>> making it prefer the old buffer interface. The issue is also most likely
>> unfixable on Py3, since "s#" has semantics that are not really
>> compatible with the new buffer interface.
How about this:
- Cache/store the format string as a bytes object in a global
WeakKeyDictionary, keyed by dtype.
- The array holds a ref to the dtype, and the Py_buffer holds a ref to
the array (through the obj field), so the cached string stays alive for
as long as the buffer does.
Alternatively, create a new Python object and stick it in the "obj" field
of the Py_buffer. I don't think obj has to point to the actual object the
buffer was acquired from, as long as it somehow keeps a reference to it
alive (though I didn't find any docs for the obj field; it was added
as an afterthought by the implementors after the PEP...). But the only
advantage is not using weak references (if those are a problem), and it is
probably slower and doesn't cache the string.
>> - We need to cache the buffer protocol format string somewhere,
>> if we do not want to regenerate it on each buffer acquisition.
>
> My suspicion is that YAGNI. I would wait until it is actually in use
> and we see whether it takes up a significant amount of time in actual
> code.
The slight problem with that is that if somebody discovers that this is a
bottleneck in their code, the wait for a new NumPy release could be quite
long. Not that I think it will ever be a problem.
--
Dag Sverre