[Numpy-discussion] Cython issues w/ 1.4.0

Dag Sverre Seljebotn dagss@student.matnat.uio...
Tue Dec 8 14:52:16 CST 2009


Robert Kern wrote:
> On Tue, Dec 8, 2009 at 12:38, Pauli Virtanen <pav@iki.fi> wrote:
>> Tue, 2009-12-08 at 12:14 -0600, Robert Kern wrote:
>>> On Tue, Dec 8, 2009 at 12:08, Pauli Virtanen <pav@iki.fi> wrote:
>>>> Wed, 2009-12-09 at 02:47 +0900, David Cournapeau wrote:
>>>> [clip]
>>>>> Of course, this does not prevent applying your suggested change -
>>>>> I don't understand why you want to add it to 1.4.0, though. 1.4.0 does
>>>>> not break the ABI compared to 1.3.0. Or is it "just" to keep the
>>>>> Cython issue from reappearing in 1.5.0?
>>>> Yes, it's to avoid having to deal with the Cython issue again in 1.5.0.
>>> Do we have any features on deck that would add a struct member? I
>>> think it's pretty rare for us to do so, as it should be.
>> If we want to support PEP 3118 on Py2.6, then new fields would be
>> useful:
>>
>> - Python2.6 currently has issues with PyArg_ParseTuple("s#", ...):
>>  defining a bf_releasebuffer breaks that particular feature.
>>
>>  Consequently, if we want backwards compatibility, we cannot keep track
>>  of allocated memory using the Py_buffer structure, so something else
>>  is needed.
>>
>>  We can probably get this fixed in future Python 2.6/2.7 releases,
>>  making it prefer the old buffer interface.  The issue is also most
>>  likely unfixable on Py3, since "s#" has semantics that are not really
>>  compatible with the new buffer interface.

How about this:
  - Cache/store the format string in a bytes object in a global
    WeakRefKeyDict (?), keyed by dtype
  - The array holds a ref to the dtype, and the Py_buffer holds a ref to
    the array (through the obj field).

Alternatively, create a new Python object and stick it in the "obj" field
of the Py_buffer. I don't think obj has to point to the actual object the
buffer was acquired from, as long as it somehow keeps a reference to that
object alive (though I didn't find any docs for the obj field; it was
added as an afterthought by the implementors after the PEP...). But the
only advantage is that it avoids weak references (if those are a problem),
and it is probably slower and doesn't cache the string.
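
The ownership chain for this alternative could look roughly like the following, again as a Python model of what would really be C code. FakeArray, BufferHolder, FakeBufferView and get_buffer are hypothetical names for illustration; FakeBufferView models only the obj and format fields of Py_buffer.

```python
class FakeArray:
    """Hypothetical stand-in for the array exporting the buffer."""


class BufferHolder:
    """Hypothetical helper: owns the format string, keeps the exporter alive."""

    def __init__(self, exporter, fmt):
        self.exporter = exporter  # strong reference to the real exporter
        self.fmt = fmt            # bytes object backing the format pointer


class FakeBufferView:
    """Models only the 'obj' and 'format' fields of Py_buffer."""

    def __init__(self, obj, fmt):
        self.obj = obj
        self.format = fmt


def get_buffer(arr):
    # The format string is rebuilt on every acquisition (no caching here).
    fmt = b"d"  # placeholder value
    holder = BufferHolder(arr, fmt)
    # obj points at the holder, not at arr; arr stays alive through holder.
    return FakeBufferView(holder, holder.fmt)
```

Releasing the view drops the holder, which in turn drops both the format string and the reference to the array.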

>> - We need to cache the buffer protocol format string somewhere,
>>  if we do not want to regenerate it on each buffer acquisition.
> 
> My suspicion is that YAGNI. I would wait until it is actually in use
> and we see whether it takes up a significant amount of time in actual
> code.

The slight problem with that is that if somebody discovers that this is a
bottleneck in their code, the wait for a new NumPy release could be quite
long. Not that I think it will ever be a problem.


-- 
Dag Sverre

