[SciPy-user] Record array help

Pierre GM pgmdevlist@gmail....
Mon May 19 10:13:57 CDT 2008

> Not sure whether to ask here or on the matplotlib list, but since it's
> mainly a numpy/scipy issue I thought I'd try here first.

You'll find some basic information about record arrays on that link:

> 1. Is it possible to change the dtype of a field after the record array has
> been created?

I'm afraid you can't change it in place. However, you can always define a new 
dtype afterwards and copy your record array into an array that uses it.
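
As a minimal sketch of that idea (the field names and values below are made up for illustration, and this is only one way of doing it), you can allocate an empty array with the new dtype and copy the data over field by field:

```python
import numpy as np

# Hypothetical record array: field 'a' was read as int, but we want it as float.
old = np.array([(1, 2.5), (2, 3.5)], dtype=[('a', int), ('b', float)])

# Same field names, new type for 'a'.
new_dtype = [('a', float), ('b', float)]

# Allocate an array with the new dtype and copy each field across;
# the values are cast to the new field type on assignment.
new = np.empty(old.shape, dtype=new_dtype)
for name in old.dtype.names:
    new[name] = old[name]

print(new.dtype)
print(new['a'])
```

The original array is left untouched, so this costs one extra copy of the data.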

> 2. The CSV file has missing data points - how do I turn these into python
> 'None' elements in the record array?

You may want to try numpy.ma.mrecords, which lets you mask specific fields 
in a record array (instead of masking whole records). 
However, the module is still experimental, and some tweaking will be needed.
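
To give an idea of what masking a single field looks like, here is a small sketch. Note that it uses a plain numpy.ma masked array with a structured dtype rather than the mrecords module itself, and that the data and the -1.0 placeholder for the missing value are invented for the example:

```python
import numpy as np

ndtype = [('a', int), ('b', float)]

# Second record: field 'b' is missing in the (hypothetical) CSV file.
# The placeholder -1.0 is arbitrary; the mask is what marks it as missing.
data = np.ma.array(
    [(1, 2.5), (2, -1.0), (3, 4.5)],
    mask=[(False, False), (False, True), (False, False)],
    dtype=ndtype,
)

# Each field comes back as a masked array carrying its own mask.
print(data['b'])         # the missing entry is displayed as masked
print(data['b'].mean())  # masked entries are left out of the computation
```

Only field 'b' of the second record is masked here; field 'a' of the same record stays usable.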

> 3. Is it possible to obtain a subset of the original data (corresponding to
> two or more columns of the CSV file) as a conventional 2D numpy array, or
> can I access the data only individually by column (i.e. field in the record
> array)?

Yes, you can get a subset:
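>>> import numpy as np
>>> # Define some fields (all of the same length)
>>> a = np.arange(10, dtype=int)
>>> b = np.arange(10, 0, -1, dtype=int)
>>> c = np.random.rand(10)
>>> ndtype = [('a', int), ('b', int), ('c', float)]
>>> # Define your record array (list() is needed on Python 3, where zip is lazy)
>>> mrec = np.array(list(zip(a, b, c)), dtype=ndtype)
>>> # Get a subset #1: by selecting fields
>>> subset_1 = np.column_stack([mrec['a'], mrec['b']])
>>> # Get a subset #2: by changing the view, then slicing the first two columns.
>>> # The view reinterprets the raw bytes, so every field must have the same
>>> # item size, and only columns whose field really is an int are meaningful.
>>> subset_2 = mrec.view((int, 3))[:, :2]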

Method #2 is quite useful if your fields have the same dtype: that way, you 
can switch from records/fields to lines/columns seamlessly.
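
A short sketch of that round trip, assuming three float64 fields (the names x, y, z and the values are invented for the example). For fields of mixed types, the sketch also shows numpy.lib.recfunctions.structured_to_unstructured, which is available in more recent NumPy releases (1.16 and later) and casts instead of reinterpreting bytes:

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# All fields share one dtype, so a view can reinterpret the buffer safely.
ndtype = [('x', np.float64), ('y', np.float64), ('z', np.float64)]
rec = np.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)], dtype=ndtype)

# records/fields -> lines/columns: no copy, both arrays share memory.
plain = rec.view((np.float64, 3))
plain[0, 1] = 20.0  # also changes rec['y'][0]

# lines/columns -> records/fields
back = plain.view(ndtype).reshape(-1)

# Mixed field types: select the fields first, then convert by casting.
mixed = np.array([(1, 2, 0.5), (3, 4, 1.5)],
                 dtype=[('a', int), ('b', int), ('c', float)])
ab = structured_to_unstructured(mixed[['a', 'b']])

print(plain.shape, rec['y'], back['z'], ab)
```

Because the view shares memory with the record array, writing through one is visible through the other; take a copy first if that is not what you want.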
