[SciPy-user] Record array help
Mon May 19 10:13:57 CDT 2008
> Not sure whether to ask here or on the matplotlib list, but since it's
> mainly a numpy/scipy issue I thought I'd try here first.
You'll find some basic information about record arrays on that link:
> 1. Is it possible to change the dtype of a field after the record array has
> been created?
I'm afraid you can't change it in place. However, you can always define a new
dtype afterwards and convert your record array to it.
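As a minimal sketch of the "new dtype" route, assuming a small made-up record array (the field names 'id' and 'value' are only for illustration), one way to do the conversion is astype with a full structured dtype:

```python
import numpy as np

# Hypothetical record array: an integer field and a float field.
rec = np.array([(1, 2.5), (2, 3.5)],
               dtype=[('id', np.int64), ('value', np.float64)])

# Same field names, but 'id' is now stored as float.
new_dtype = [('id', np.float64), ('value', np.float64)]

# astype returns a new array; the original 'rec' is left untouched.
converted = rec.astype(new_dtype)

print(converted.dtype)
print(converted['id'])
```

Note that this makes a copy with the values cast field by field, so the original array keeps its old dtype.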
> 2. The CSV file has missing data points - how do I turn these into python
> 'None' elements in the record array?
You may want to try numpy.ma.mrecords, that gives the possibility to mask
specific fields in a record array (instead of masking whole records).
However, the module is still experimental, and some tweaking will be needed.
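A rough sketch of what per-field masking can look like, assuming two made-up columns ('station' and 'temperature') where one temperature reading is missing. It builds the masked record array with numpy.ma.mrecords.fromarrays; how you get from your CSV file to the masked columns is left out here:

```python
import numpy as np
import numpy.ma as ma
from numpy.ma import mrecords

# Hypothetical columns; the second temperature value is missing,
# so it carries a placeholder value and is flagged in the mask.
station = ma.array([1, 2, 3], mask=[False, False, False])
temperature = ma.array([20.5, 0.0, 22.1], mask=[False, True, False])

# Build a masked record array: each field keeps its own mask.
mrec = mrecords.fromarrays([station, temperature],
                           names='station,temperature')

# Only the 'temperature' entry of the second record is masked.
temp_mask = ma.getmaskarray(mrec['temperature'])
station_mask = ma.getmaskarray(mrec['station'])

# Masked entries are skipped by reductions such as mean().
mean_temperature = mrec['temperature'].mean()
print(temp_mask, mean_temperature)
```

The missing value is not a Python None here: it stays in the array as a placeholder and is simply ignored wherever the mask is set.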
> 3. Is it possible to obtain a subset of the original data (corresponding to
> two or more columns of the CSV file) as a conventional 2D numpy array, or
> can I access the data only individually by column (i.e. field in the record
> array)?
Yes, you can get a subset:
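>>> import numpy as np
>>> # Define some fields (all of the same length)
>>> a = np.arange(10, dtype=int)
>>> b = np.arange(10, 0, -1, dtype=int)
>>> c = np.random.rand(10)
>>> ndtype = [('a', int), ('b', int), ('c', float)]
>>> # Define your record array (list() is needed on Python 3, where zip is lazy)
>>> mrec = np.array(list(zip(a, b, c)), dtype=ndtype)
>>> # Get a subset #1: by selecting fields
>>> subset_1 = np.column_stack([mrec['a'], mrec['b']])
>>> # Get a subset #2: by changing the view. This assumes int and float have
>>> # the same item size; the float field 'c' is reinterpreted as int, so
>>> # keep only the first two columns.
>>> subset_2 = mrec.view((int, 3))[:, :2]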
Method #2 is quite useful if all your fields have the same dtype: that way, you
can switch from records/fields to rows/columns seamlessly.
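To illustrate the same-dtype case, here is a small sketch with three float fields (the names 'x', 'y' and 'z' are made up). It also shows numpy.lib.recfunctions.structured_to_unstructured, which is not part of the original answer and is only available in newer NumPy releases (1.16 and later), as an alternative for pulling out a few columns:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical record array whose fields all share the same dtype.
rec = np.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
               dtype=[('x', np.float64), ('y', np.float64), ('z', np.float64)])

# View the records as a plain 2D array: one row per record,
# one column per field. No data is copied.
as_2d = rec.view((np.float64, 3))

# Because it is a view, writing to the 2D array changes the records too.
as_2d[0, 0] = 10.0

# Alternative for a subset of columns: select the fields, then convert.
xy = rfn.structured_to_unstructured(rec[['x', 'y']])

print(as_2d.shape, rec['x'], xy)
```

The view only makes sense when every field has the same dtype; with mixed dtypes, stacking the selected fields (method #1) is the safer choice.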