[SciPy-dev] Changes in SVN scipy_core

Travis Oliphant oliphant.travis at ieee.org
Thu Dec 15 04:58:29 CST 2005


Francesc Altet wrote:

>Travis,
>
>On Wednesday 14 December 2005 23:56, Travis Oliphant wrote:
>  
>
>>I want to let you all know about the recent changes in the SVN version
>>of scipy_core.  As you may recall, I dramatically improved the way the
>>data in an array can be understood by improving the PyArray_Descr
>>structure and making it an object.  As part of that improvement, it
>>became clear that the NOTSWAPPED flag was an anomaly and shouldn't be a
>>flag on the array itself.  The byte-order of the data is a property of
>>the data-type descriptor.  This is especially clear when considering
>>records which according to the array protocol can have some fields in
>>one byte order and some in another.
>>    
>>
>
>It is not clear to me whether supporting different byte orders in the
>same recarray (or plain array) is necessary. This question arose
>some months ago on the numpy list, and not even Perry Greenfield was
>able to come up with an example of a situation where this would be useful.
>  
>
It's not clear to me when such beasts would be useful either.  However,
it takes no extra effort to support them once you place the idea of
byte-swapping in the data-type descriptor where it belongs.   Doing this
actually cleans up a lot of code because, logically, the data-type
descriptor is the right place for the concept of machine byte-order.
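
To make the idea concrete, here is a rough sketch of a record whose fields
carry different byte orders.  It assumes present-day NumPy (the descendant of
scipy_core) rather than the 2005 SVN code, and the field names "a" and "b"
are made up for illustration:

```python
import numpy as np

# A record type whose two fields use different byte orders:
# "a" is a little-endian 4-byte int, "b" is a big-endian 4-byte int.
# The byte order lives in the descriptor, not in a flag on the array.
mixed = np.dtype([("a", "<i4"), ("b", ">i4")])

# Eight raw bytes laid out exactly as that descriptor describes.
raw = (1).to_bytes(4, "little") + (1).to_bytes(4, "big")
rec = np.frombuffer(raw, dtype=mixed)

# Both fields decode to the same value despite the different layouts.
a_value = int(rec["a"][0])
b_value = int(rec["b"][0])
```

The array object itself needs no swapped/not-swapped state here; each field's
descriptor says how its bytes are to be read.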

My general thinking has been that the whole point of record arrays is to
be able to support memory-mapped versions of files with arbitrary
fixed-length records.  Part of this has been to abstract the notion of
the data-type descriptor to where it is now.
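
As a sketch of that use case, assuming present-day NumPy and a hypothetical
file of 12-byte big-endian records (the layout, field names, and file name
are invented for the example):

```python
import os
import struct
import tempfile

import numpy as np

# Hypothetical on-disk layout: each 12-byte record holds a big-endian
# int32 "id" followed by a big-endian float64 "value", with no padding.
record = np.dtype([("id", ">i4"), ("value", ">f8")])

# Write three such records to a scratch file.
path = os.path.join(tempfile.mkdtemp(), "records.bin")
with open(path, "wb") as f:
    for i in range(3):
        f.write(struct.pack(">id", i, i * 0.5))

# Map the file; the record count is inferred from the file size.
mapped = np.memmap(path, dtype=record, mode="r")
ids = mapped["id"].tolist()
values = mapped["value"].tolist()
del mapped  # drop the reference to the mapping
```

The descriptor alone carries the field offsets, sizes, and byte orders, so the
file never has to be converted before it is read.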

>My opinion is that this feature unnecessarily complicates
>the code, potentially making the treatment of non-native byteorder
>records more costly.
>
I don't see why what I've done would cause this.  If you don't specify
byteorders in your data-type descriptors you always get native byteorder,
which always executes the fastest.
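
A small sketch of that default, again assuming present-day NumPy rather than
the SVN code of the time:

```python
import sys

import numpy as np

# No byte-order character given: the descriptor is native.
native = np.dtype("i4")

# Deliberately pick the order that is NOT native on this machine.
swapped = np.dtype(">i4" if sys.byteorder == "little" else "<i4")

data = np.array([0, 1, 2, 3], dtype=swapped)

# Converting to the native-order version of the same type gives the
# same values in the layout the machine handles directly.
native_copy = data.astype(swapped.newbyteorder("="))
```

Only code that explicitly asks for a non-native order gets a non-native
descriptor; everything else stays on the native path.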

>BTW, I've seen in the latest SVN checkout that you are making great
>progress in porting the recarray object. I'd like to have a look at
>the code and try to port our nestedrecarray as well, and offer it for your
>consideration. I've also seen that you started a new unit test module
>for recarray. That's great! I hope to contribute some unit tests for
>it as well.
>  
>
I'm really excited about the progress of records in scipy_core.  I think
I've been able to get so far so quickly because I abstracted the
notion of the data-type descriptor, which was always sitting there
under-utilized in Numeric.    For the first time, the full
array_interface __array_descr__ protocol is supported (well, I haven't
actually put in the couple of lines that would consume the interface,
but it is exported...), including nested records...

Of course there are probably still some outstanding issues that will
crop up as more people test.  One issue, for example, is that the
recarray class currently returns recarray objects when accessing fields
(because that is the default behavior of every subclass).   In theory
this should be fine, but another approach is to return actual ndarrays
and chararrays instead.
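
To show what the two choices look like from the user's side, here is a
sketch assuming present-day NumPy, where (as far as I can tell) access to a
plain, non-record field hands back a base ndarray.  That is not necessarily
what the SVN code discussed here does, and the field names are made up:

```python
import numpy as np

# A small record array with two simple (non-nested) fields.
r = np.rec.fromarrays([[1, 2], [3.0, 4.0]], names="x,y")

# Attribute access and item access both reach the same field data.
by_attr = r.x
by_item = r["x"]

# In present-day NumPy a simple field comes back as a plain ndarray,
# while the container itself is still a recarray.
field_type = type(by_attr)
container_type = type(r)
```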

Thanks for the feedback.  Absent feedback, I've done a lot of
"moving forward", which I hope isn't too awful once people start to
understand and utilize what is there...

-Travis






