[Numpy-discussion] Re: Bytes Object and Metadata

Francesc Altet faltet at carabos.com
Mon Mar 28 08:52:18 CST 2005

On Monday 28 March 2005 17:13, Francesc Altet wrote:
> Based on suggestions from Todd Miller on how
> to do this as efficiently as possible, I have arrived at the
> conclusion that the following conversions are the most efficient ones:
> In [69]:na = numarray.arange(100*1000,shape=(100,1000))
> In [70]:num = Numeric.arange(100*1000);num=num.resize((100,1000))
> In [72]:t1=time();num2=Numeric.fromstring(na._data,typecode=na.typecode());num2=num2.resize(na.shape);time()-t1
> Out[72]:0.0017759799957275391
> In [73]:t1=time();na2=numarray.fromstring(num.tostring(),type=num.typecode(),shape=num.shape);time()-t1
> Out[73]:0.0039050579071044922

Er, sorry, there is in fact a more efficient way to convert from a
Numeric object to a numarray object that doesn't require any data copy
at all. This is:

In [212]:num=Numeric.arange(100*1000, typecode="i");num=num.resize((100,1000))
In [213]:num[0,:5]
Out[213]:array([0, 1, 2, 3, 4],'i')
Out[214]:0.0001010894775390625 # takes just 100 us!
In [215]:na2[0,4] = 1   # modify a cell 
In [216]:num[0,:5]
Out[216]:array([0, 1, 2, 3, 1],'i')
In [217]:na2[0,:5]
Out[217]:array([0, 1, 2, 3, 1])  # na2 has been modified as well, so the
                                 # data area is shared between num and na2

In fact, its speed is independent of the array size (as it should be
for a procedure that copies no data):

# Create a Numeric object 10x larger
In [218]:num=Numeric.arange(1000*1000, 
Out[219]:0.00010204315185546875  # 100 us again!

This is because numarray has chosen to use a buffer object internally,
and because the Numeric object can be wrapped by a buffer object without
any actual data copy.
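
The sharing behaviour described above can be sketched with current
Python. This is only an illustration under stated assumptions: it uses
the standard library's array.array as a stand-in for the Numeric array
and memoryview as a stand-in for the buffer object that numarray wraps.
Neither Numeric nor numarray is used here.

```python
from array import array

# Stand-in for the Numeric array: a contiguous block of C ints.
num = array("i", range(10))

# Stand-in for the buffer-based wrapper: a view over the same memory.
# Creating the view does not copy the data.
na2 = memoryview(num)

# Modify a cell through the view...
na2[4] = 1

# ...and the original object sees the change, because both share
# a single data area.
print(num[:5].tolist())

# For contrast, an explicit copy owns separate memory, so later
# changes to num do not reach it.
copied = array("i", num.tobytes())
num[0] = 7
print(copied[0], num[0])
```

The view and the original stay in sync, while the copy made through
tobytes() does not follow later writes.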

That leads me to think that, if the bytes object (which seems to be
implemented by Numeric3) could wrap the buffer object where numarray
objects hold their data, the conversion between Numeric3 <--> numarray
(or, in general, between packages that deal with bytes objects
and packages that deal with buffer objects) could be done at
constant cost, O(1) (that is, independent of the data size).
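
To make the difference in cost concrete, here is a rough sketch that
contrasts a wrapping conversion with a copying one for two data sizes.
The helper names wrap, copy and best_time are hypothetical and exist
only for this illustration; bytearray and memoryview from the standard
library stand in for the real array packages. Absolute timings depend
on the machine, so none are quoted.

```python
import time

def wrap(buf):
    """Hypothetical zero-copy conversion: expose buf through a view."""
    return memoryview(buf)

def copy(buf):
    """Copying conversion: duplicate every byte of buf."""
    return bytes(buf)

def best_time(func, buf, repeat=5):
    """Return the best wall-clock time of func(buf) over a few runs."""
    best = float("inf")
    for _ in range(repeat):
        t0 = time.perf_counter()
        func(buf)
        best = min(best, time.perf_counter() - t0)
    return best

small = bytearray(100 * 1000)        # 100 KB of data
large = bytearray(10 * 100 * 1000)   # 10x larger

timings = {
    "wrap small": best_time(wrap, small),
    "wrap large": best_time(wrap, large),
    "copy small": best_time(copy, small),
    "copy large": best_time(copy, large),
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.2e} s")
```

The wrapping path does the same amount of work for both sizes, since it
only creates a small view object, whereas the copying path has to touch
every byte.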

If this cannot be done (I mean, getting a safe bytes object from a
buffer object and vice versa), well, it would be a pity. Do you think
that would be possible at all?


>qo<   Francesc Altet     http://www.carabos.com/
V  V   Cárabos Coop. V.   Enjoy Data
