[Numpy-discussion] Re: Bytes Object and Metadata

Francesc Altet faltet at carabos.com
Mon Mar 28 07:17:10 CST 2005


Hi Travis, Scott,

I've been following your discussions, and I'm very happy that Travis
has finally decided to adopt the bytes object in Numeric3. It's also
very important that, out of those discussions, you have reached almost
complete agreement on how to support the __array__ protocol. I think
this idea is both very simple and powerful.

I hope this will be a *major* step towards interchanging data
between different applications and packages. Perhaps it will even
make the final goal of including a specific ndarray object in the
Python standard library almost pointless: it simply should not be
necessary at all!

On Monday 28 March 2005 11:30, Travis Oliphant wrote:
[snip]
> As I've already said, it would be easy to check for the more specialized
> attributes at object creation time to boot-strap an array from an
> arbitrary object.
[snip]
> Let's go ahead and get some __array__XXXXX  attribute names decided on.
> I'll put them in the Numeric3 code base (I could also put them in old
> Numeric and make a 24.0 release as well --- I need to do that because of
> a horrible bug in the new empty method:   Numeric.empty(<shape>, 'O').

Very nice! From what you stated above I deduce that you will be
including a case in the Numeric.array constructor so that it can
create a properly defined array if the sequence that is passed to it
fulfils the __array__ protocol.
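
To make the idea concrete, here is a minimal pure-Python sketch of a
constructor that checks for protocol attributes at creation time. It is
only an illustration under assumptions: the attribute names
__array_shape__, __array_typecode__ and __array_data__ are placeholders
(the real names are still undecided in this thread), and SimpleArray,
Exporter and as_array are hypothetical helpers, not Numeric or numarray
code.

```python
import struct


class SimpleArray:
    """Toy stand-in for an array type: flat values plus a shape."""

    def __init__(self, values, shape):
        self.values = list(values)
        self.shape = tuple(shape)


class Exporter:
    """Hypothetical foreign object exposing placeholder protocol attributes."""

    def __init__(self, values, shape):
        self.__array_shape__ = tuple(shape)
        self.__array_typecode__ = "i"  # struct code for a C int (assumed)
        self.__array_data__ = struct.pack("%di" % len(values), *values)


def as_array(obj):
    """Build a SimpleArray, preferring the (placeholder) protocol attributes."""
    shape = getattr(obj, "__array_shape__", None)
    typecode = getattr(obj, "__array_typecode__", None)
    data = getattr(obj, "__array_data__", None)
    if shape is not None and typecode is not None and data is not None:
        # Number of items is the product of the dimensions.
        count = 1
        for dim in shape:
            count *= dim
        values = struct.unpack("%d%s" % (count, typecode), data)
        return SimpleArray(values, shape)
    # Fallback: treat obj as an ordinary flat sequence.
    values = list(obj)
    return SimpleArray(values, (len(values),))
```

Any object that carries the three attributes is picked up by as_array
without the constructor knowing its type; everything else goes through
the ordinary sequence path.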

In addition, if the numarray people are willing to do the same
thing, I envision a very easy (and very efficient) way to convert
between Numeric and numarray (until Numeric3 is ready for
production), something like:

NumericArray = Numeric.array(numarrayArray)
numarrayArray = numarray.array(NumericArray)

Internally, one has to decide on the optimal way to convert from
one object to the other. Based on suggestions from Todd Miller on how
to do this as efficiently as possible, I have come to the conclusion
that the following conversions are the most efficient ones:

In [69]:na = numarray.arange(100*1000,shape=(100,1000))
In [70]:num = Numeric.arange(100*1000);num=num.resize((100,1000))

In [72]:t1=time();num2=Numeric.fromstring(na._data, typecode=na.typecode());num2=num2.resize(na.shape);time()-t1
Out[72]:0.0017759799957275391
In [73]:t1=time();na2=numarray.fromstring(num.tostring(),type=num.typecode(),shape=num.shape);time()-t1
Out[73]:0.0039050579071044922

Both ways, although very efficient, still copy the data area during
the conversion. In the future, once Numeric3 supports the bytes
object, no memory will need to be copied at all when interchanging
data with another package (e.g. numarray).  Until then, the __array__
protocol can already help to share data (well, at least contiguous
data) efficiently between applications.
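
The difference between copying and sharing can be sketched with
today's standard library. This is only an analogy, under the assumption
that the array module and memoryview (which postdate this thread) stand
in for the array packages and the proposed bytes object; it is not
Numeric3 code.

```python
from array import array

source = array("i", range(6))

# Copying path: serialise to a byte string and rebuild from it, much like
# the fromstring()/tostring() round trips timed above. The result owns
# its own, separate memory.
copied = array("i")
copied.frombytes(source.tobytes())

# Sharing path: a memoryview exposes the very same memory without copying.
shared = memoryview(source)

# The shared buffer can also be viewed with a shape (2 rows, 3 columns),
# still without copying. The cast has to go through the byte format "B".
grid = shared.cast("B").cast("i", shape=[2, 3])

# Modify the original: only the sharing views see the change.
source[0] = 99
print(copied[0], shared[0], grid.tolist())
```

After the assignment, copied still holds the old value while shared and
grid reflect the new one, which is exactly the behaviour one wants from
a no-copy interchange.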

A big thanks to Scott for suggesting and wholeheartedly defending the
bytes object, and to Travis for so sensibly becoming a convert. We, the
developers of extensions, will be grateful forever :-)

Cheers,

-- 
>qo<   Francesc Altet     http://www.carabos.com/
V  V   Cárabos Coop. V.   Enjoy Data
 ""





