[Numpy-discussion] Bytes vs. Unicode in Python3

Dag Sverre Seljebotn dagss@student.matnat.uio...
Sat Dec 5 04:16:55 CST 2009


Francesc Alted wrote:
> On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote:
>> Pauli Virtanen wrote:
>>> Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote:
>>> [clip]
>>>
>>>> Great! Are you storing the format string in the dtype types as well? (So
>>>> that no release is needed and acquisitions are cheap...)
>>> I regenerate it on each buffer acquisition. It's simple low-level C code,
>>> and I suspect it will always be fast enough. Of course, we could *cache*
>>> the result in the dtype. (If dtypes are immutable, which I don't remember
>>> right now.)
>> We discussed this at SciPy 09 -- basically, they are not necessarily
>> immutable in implementation, but anywhere they are mutable is a bug, and
>> no code should depend on their mutability, so we are free to assume
>> immutability.
> 
> Mmh, the only case of dtype *mutability* that I'm aware of is changing the
> names of compound types:
> 
> In [19]: t = np.dtype("i4,f4")
> 
> In [20]: t
> Out[20]: dtype([('f0', '<i4'), ('f1', '<f4')])
> 
> In [21]: hash(t)
> Out[21]: -9041335829180134223
> 
> In [22]: t.names = ('one', 'other')
> 
> In [23]: t
> Out[23]: dtype([('one', '<i4'), ('other', '<f4')])
> 
> In [24]: hash(t)
> Out[24]: 8637734220020415106
> 
> Perhaps this should be marked as a bug?  I'm not sure about that, because the 
> above seems quite useful.

Well, I for one don't like this, but that's just an opinion. I think it
is unwise to leave an object that supports hash() mutable, because it
makes hard-to-find bugs too easy to create (using a dtype as a key in a
dict is rather useful in many situations). There's a certain tradition in
Python of keeping types immutable where possible, and dtype certainly
feels like such a type.
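
To illustrate the hazard, here is a minimal sketch using a hypothetical
stand-in class, FieldSpec (not part of NumPy), whose hash follows a
mutable attribute. Once the object has been used as a dict key and is
then renamed in place, lookups by either the old or the new value miss
the entry, even though it is still stored:

```python
class FieldSpec:
    """Hypothetical stand-in for a hashable object with mutable state."""

    def __init__(self, names):
        self.names = tuple(names)

    def __eq__(self, other):
        return isinstance(other, FieldSpec) and self.names == other.names

    def __hash__(self):
        # The hash follows the current value of a mutable attribute.
        return hash(self.names)


spec = FieldSpec(("f0", "f1"))
cache = {spec: "cached result"}  # the hash is recorded at insertion time

# Rename the fields in place, as in the t.names assignment quoted above.
spec.names = ("one", "other")

# A fresh key with the old names finds the old hash, but equality now fails.
found_by_old_value = FieldSpec(("f0", "f1")) in cache
# A fresh key with the new names hashes differently from the stored hash.
found_by_new_value = FieldSpec(("one", "other")) in cache
# The entry itself has not gone anywhere.
entries = len(cache)
```

Both lookups come back False while the dict still holds one entry,
which is the kind of bug that is hard to track down.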

Anyway, the buffer PEP can be supported simply by updating the buffer 
format string on the "names" setter, so it's an orthogonal issue.

BTW note that the buffer PEP provides for supplying names of fields:

T{
  i:one:
  f:other:
}

(or similar). NumPy should probably do so at some point in the future;
the Cython implementation doesn't because Cython doesn't use this
information.
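
For a consumer that does want the names, a minimal sketch of pulling
them out of such a string follows. parse_flat_struct is a hypothetical
helper, and it assumes a single flat T{...} with one-character type
codes; it does not handle the full PEP grammar (nesting, repeat counts,
byte-order marks):

```python
import re

# One field: a single-letter type code followed by ":name:".
_FIELD = re.compile(r"\s*([a-zA-Z?])\s*:([^:]*):")


def parse_flat_struct(fmt):
    """Return [(code, name), ...] for a flat 'T{...}' format string."""
    fmt = fmt.strip()
    if not (fmt.startswith("T{") and fmt.endswith("}")):
        raise ValueError("expected a single T{...} struct")
    body = fmt[2:-1]
    fields = []
    pos = 0
    while body[pos:].strip():
        m = _FIELD.match(body, pos)
        if m is None:
            raise ValueError("unsupported format near %r" % body[pos:])
        fields.append((m.group(1), m.group(2)))
        pos = m.end()
    return fields


fields = parse_flat_struct("T{\n  i:one:\n  f:other:\n}")
# A consumer that ignores names only needs the type codes.
codes = "".join(code for code, _ in fields)
```

Here fields is [("i", "one"), ("f", "other")] and codes is "if".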

-- 
Dag Sverre

