[Numpy-discussion] [ANN] NumPy 1.0b4 now available

Les Schaffer schaffer at optonline.net
Sun Aug 27 09:06:55 CDT 2006


Travis:

thanks for your response. over the next couple days i will be working
with the records module, trying to fix things so we can move from
numarray to numpy. i will try to collect some docstrings that can be
added to the code base.


Travis Oliphant wrote:
> You're right.  It's an oversight that needs to be corrected.    NumPy has 
> a very capable records facility and the great people at STSCI have been 
> very helpful in pointing out issues to help make it work reasonably like 
> the numarray version.  In addition, the records.py module started as a 
> direct grab of the numarray code-base, so I think I may have mistakenly 
> believed it was equivalent.  But, it really should also be in the 
> numarray compatibility module.
>   

this would solve our problem in the short run, so at least we can switch
to numpy and keep our code running.

> The same is true of the chararrays defined in numpy with respect to the 
> numarray.strings module.
>   

i take it this might solve the problem (below) of the automagic strip
with the numarray package.

>> 2. my code that uses recarrays is now broken if i use
>> numpy.core.records; for one thing, you have no .info attribute. 
>>     
> Not all the attributes are supported.  The purpose of 
> numpy.numarray.alter_code1 is to fix those attributes for you to numpy 
> equivalents.  In the case of info, for example, there is the function 
> numpy.numarray.info(self) instead of self.info().
>   

thanks. i wasn't clear how to call the info function. now when i try
this, i get:

Traceback (most recent call last):
  File "<stdin>", line 772, in ?
  File "<stdin>", line 751, in _test_TableManager
  File "<stdin>", line 462, in build_db_table_structures
  File "<stdin>", line 108, in _create_tables_structure_from_rsrc
  File "C:\Program Files\Python24\Lib\site-packages\numpy\numarray\functions.py", line 350, in info
    print "aligned: ", obj.flags.isaligned
AttributeError: 'numpy.flagsobj' object has no attribute 'isaligned'
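
As a point of comparison for the traceback above: in the NumPy releases I am aware of, the flags object exposes the alignment flag as `aligned`, not `isaligned`. The sketch below is only an illustration of a tolerant lookup, not the fix applied in numpy.numarray; the helper name `describe_alignment` is hypothetical.

```python
import numpy as np

def describe_alignment(obj):
    """Return the alignment flag of an array, accepting either attribute name."""
    flags = obj.flags
    # 'aligned' is the attribute found on numpy's flags object;
    # 'isaligned' is the name the failing info() call looked up.
    if hasattr(flags, "aligned"):
        return bool(flags.aligned)
    return bool(flags.isaligned)

arr = np.zeros(4, dtype=np.float64)
aligned = describe_alignment(arr)
print("aligned:", aligned)
```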

>   
>> another example: strings pushed into the arrays *apparently* were stripped
>> automagically in the old recarray (so we coded appropriately), but now
>> are not.  
>>   
>>     
> We could try and address this in the compatibility module (there is the 
> raw ability available to deal with this exactly as numarray did).   
> Someone with more experience with numarray would really be able to help 
> here as I'm not as aware of these kinds of issues, until they are 
> pointed out. 
>   

this would be great, because then i could find out where else the code
is broken   ;-)

i will make my code changes in such a way that i can keep testing for
incompatibilities. so for now, i will add code to strip the
leading/trailing spaces off, but suitably if'ed so when this gets fixed
in numpy, i can pull out the strips and see if anything else works
differently than numarray.records.
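
A minimal sketch of the "suitably if'ed" strip described above, assuming the strings can be held in an ordinary NumPy string array. The flag `NEEDS_MANUAL_STRIP` and the helper `clean_strings` are hypothetical names used only for illustration.

```python
import numpy as np

# Hypothetical switch: set to False once the compatibility layer
# strips strings the way numarray did, then re-run the tests.
NEEDS_MANUAL_STRIP = True

def clean_strings(values):
    """Return values as a string array, stripping whitespace if required."""
    arr = np.asarray(values, dtype=str)
    if NEEDS_MANUAL_STRIP:
        # np.char.strip removes leading and trailing whitespace element-wise.
        return np.char.strip(arr)
    return arr

cleaned = clean_strings(["  alpha ", "beta  ", " gamma"])
print(cleaned.tolist())
```

Keeping the strip behind one flag means a single edit turns it off everywhere, which makes any remaining differences from numarray.records easier to spot.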

>> 3. near zero docstrings for this module, hard to see how the new
>> records  works.
>>   
>>     
> The records.py code has a lot of code taken and adapted from numarray 
> nearly directly.   The docstrings present there were also copied over, 
> but nothing more was added.  There is plenty of work to do on the 
> docstrings in general.  This is an area that even newcomers can 
> contribute to greatly.  Contributions are greatly welcome.
>   

ok, i will try to send doc suggestions to whomever they should go to.

>> 4. last year i made a case for the old records to return a list of the
>> column names. 
>>     
> I prefer the word "field" names now so as to avoid over-use of the word 
> "column"

i have columnitis because we are parsing excel spreadsheets and pushing
them into recarrays. the first row of each spreadsheet has a set of
column names -- errrr, field names -- which is why we were originally
attracted to records, since it gave us a way to grab columns -- errr,
fields -- easily and out of the box.

> but one thing to understand about the record array is that it 
> is a pretty "simple" sub-class.  And the basic ndarray, by itself 
> contains the essential functionality of record arrays.   The whole 
> purpose of the record sub-class is to come up with an interface that is 
> "easier-to use," (right now that just means allowing attribute access to 
> the field names).   Many may find that using the ndarray directly may be 
> just what they are wanting and don't need the attribute-access allowed 
> by the record-array sub-class.
>   

i'll look into how the raw ndarray works. like i said, we like that we
can get a listing of each column like so:

  recObj['column_errrr_fieldname']
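
To make the distinction concrete, here is a small sketch of the two access styles: key lookup on a plain ndarray with named fields, and attribute lookup on a record-array view of the same data. The field names `qty` and `price` are made up for the example.

```python
import numpy as np

# A plain ndarray whose data-type carries two named fields.
table = np.array([(1, 2.5), (2, 3.5)],
                 dtype=[("qty", "i4"), ("price", "f8")])

# Key access works on the plain ndarray.
qty_by_key = table["qty"]

# Viewing the same data as a recarray adds attribute access.
rec = table.view(np.recarray)
qty_by_attr = rec.qty

print(qty_by_key.tolist(), qty_by_attr.tolist())
```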
>   
>> it looks like the column names are now attributes of the
>> record object, any chance of getting a list of them
>> recarrayObj.get_colNames() or some such? 
>>     
> Right now, the column names are properties of the data-type object 
> associated with the array, so that  recarrayObj.dtype.names will give 
> you a list
>
> The data-type object also has other properties which are useful. 
>   

it also looks like one can now create an ordinary array and PUSH IN
column -- errr, field -- information with dtype, is that right? pretty
slick if true.
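
A rough sketch of that idea, assuming the first spreadsheet row has already been read into a list of names and that a type is known for each column. The header, formats and rows below are invented sample data, and in the NumPy versions I know `dtype.names` comes back as a tuple, hence the `list(...)` call.

```python
import numpy as np

header = ["sample", "mass", "count"]       # hypothetical first spreadsheet row
formats = ["U8", "f8", "i4"]               # assumed type for each column
rows = [("a1", 1.5, 3), ("b2", 2.0, 7)]    # made-up data rows

# Build the data-type from (field name, format) pairs and push it in.
dt = np.dtype(list(zip(header, formats)))
table = np.array(rows, dtype=dt)

# The field names live on the data-type object.
field_names = list(table.dtype.names)

# Each field can be pulled out as a column by name.
mass_column = table["mass"]

print(field_names, mass_column.tolist())
```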

i have some comments on the helper functions for creating record and
recarray objects, but i will leave that for later.

Les



