[Numpy-discussion] Can I add rows and columns to recarray?

Benjamin Root ben.root@ou....
Mon Dec 6 13:00:30 CST 2010


On Mon, Dec 6, 2010 at 12:26 PM, Christopher Barker
<Chris.Barker@noaa.gov> wrote:

> On 12/5/10 7:56 PM, Wai Yip Tung wrote:
>
>> I'm fairly new to numpy and I'm trying to figure out the right way to do
>> things. Continuing on my question about using recarray as a relation.
>>
>
> note that recarrays (or structured arrays; AFAIK the difference is
> attribute access only -- I don't use recarrays) are far more static than a
> database table. So you may really want to use a database, or maybe pytables.
> Or maybe even just stick with lists.
>
> But if you are keeping things in memory, you should be able to do what you
> want.
>
>
>> In [339]: arr = np.array([
>>     .....:     (1, 2.2, 0.0),
>>     .....:     (3, 4.5, 0.0)
>>     .....:     ],
>>     .....:     dtype=[
>>     .....:         ('unit',int),
>>     .....:         ('price',float),
>>     .....:         ('amount',float),
>>     .....:     ]
>>     .....: )
>>
>> In [340]: data = arr.view(recarray)
>>
>>
>> One of the most common things I want to do is to append rows to data.
>>
>
> numpy arrays do not naturally support appending, as you have discovered.
>
>
>> I think concatenate() might be the method.
>>
>
> yes.
>
>
>> But I get a problem:
>>
>> In [342]: np.concatenate((data0,[1,9.0,9.0]))
>>
>> ---------------------------------------------------------------------------
>> TypeError                                 Traceback (most recent call last)
>>
>> c:\Python26\Lib\site-packages\numpy\<ipython console>  in<module>()
>>
>> TypeError: expected a readable buffer object
>>
>
> concatenate expects two arrays to be joined. If you pass in something that
> can easily be turned into an array, it will work, but a plain sequence like
> that can be converted to multiple types of arrays, so it doesn't know what to
> do. So you need to construct the second array yourself, with the same dtype:
>
> # dt is the dtype of the original array
> dt = arr.dtype
> a2 = np.array([(3, 5.5, 3)], dtype=dt)
> arr = np.concatenate((arr, a2))
>
>
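
For reference, here is a minimal, self-contained sketch of the
append-by-concatenate step described above. The field names come from the
original post; the variable names new_row and longer are only illustrative.

```python
import numpy as np

# Same layout as the array in the original question.
dt = np.dtype([('unit', int), ('price', float), ('amount', float)])
arr = np.array([(1, 2.2, 0.0), (3, 4.5, 0.0)], dtype=dt)

# Wrap the new row in a one-element array with the same dtype.
# Each record must be a tuple; a bare list of numbers is ambiguous.
new_row = np.array([(1, 9.0, 9.0)], dtype=arr.dtype)

# concatenate returns a new array; arr itself is not modified.
longer = np.concatenate((arr, new_row))

# Attribute access still works after viewing the result as a recarray.
data = longer.view(np.recarray)
```

Every call like this copies the whole array, so it is fine for the occasional
row but slow inside a tight loop.
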
>> In [343]: data.amount = data.unit * data.price
>>
>
> yup
>
>
>> But sometimes I may need to add a new column that does not already exist,
>> e.g.:
>>
>> In [344]: data.discount_price = data.price * 0.9
>>
>>
>> How can I add a new column?
>>
>
> you can't. what you need to do is create a new array with a new dtype that
> includes the new field.
>
> The trick is that numpy only supports homogeneous arrays -- every item is the
> same data type. So when you create a struct array like above, numpy does not
> define it as a 2-d table, but rather, a 1-d array, each element of which is
> a structure.
>
> so you need to do something like:
>
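> # new dtype: the original fields plus the new one
> # (dt is the dtype of the original array)
> dt2 = np.dtype(dt.descr + [('discount_price', float)])
>
> # create a new array
> data2 = np.zeros(len(data), dtype=dt2)
>
> # fill the array, one field at a time:
> for field_name in dt.names:
>    data2[field_name] = data[field_name]
>
> # now some calculations:
> data2['discount_price'] = data2['price'] * 0.9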
>
> I don't know of a way to avoid that loop when filling the array.
>
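
A self-contained sketch of the "new array with a wider dtype" approach
described above, under the assumption that the new column is a plain float.
Building dt2 from dt.descr is just one way to extend the dtype; you can also
write the full field list out by hand.

```python
import numpy as np

# Original layout and data from the question.
dt = np.dtype([('unit', int), ('price', float), ('amount', float)])
data = np.array([(1, 2.2, 0.0), (3, 4.5, 0.0)], dtype=dt)

# Extended dtype: the existing fields plus the new column.
dt2 = np.dtype(dt.descr + [('discount_price', float)])

# Allocate the wider array and copy the existing fields across by name.
data2 = np.zeros(len(data), dtype=dt2)
for name in dt.names:
    data2[name] = data[name]

# The new column can now be computed from the others.
data2['discount_price'] = data2['price'] * 0.9
```

The loop runs once per field, not once per row, so its cost is small even for
long arrays.
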
> Better yet -- anticipate your needs and create the array with all the
> fields you need in the first place.
>
> You can see that ndarrays are pretty static -- struct arrays can be useful
> data storage, but are not very suitable when things are changing much.
>
> You could write a class that wraps an ndarray, and supports what you need
> better -- it could be a pretty useful general purpose class, too. I've got
> one that handles the appending part, but nothing with adding new fields.
>
> Here's appending with my class:
>
> data3 = accumulator.accumulator(dtype = dt2)
> data3.append((1, 2.2, 0.0, 0.0))
> data3.append((3, 4.5, 0.0, 0.0))
> data3.append((2, 1.2, 0.0, 0.0))
> data3.append((5, 4.2, 0.0, 0.0))
> print(repr(data3))
>
> # convert to regular array for calculations:
> data3 = np.array(data3)
>
> # now some calculations:
> data3['discount_price'] = data3['price'] * 0.9
>
> You wouldn't have to convert to a regular array, except that I haven't
> written the code to support field access yet -- I don't think it would be
> too hard, though.
>
> I've enclosed some test code, and my accumulator class, in case you find it
> useful.
>
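
Purely as an illustration of the wrapper idea, here is a hypothetical sketch
of a growable buffer for structured records. It is not the accumulator class
mentioned above; the name RecordAccumulator, the to_array() method, and the
capacity-doubling strategy are all assumptions made for this example.

```python
import numpy as np


class RecordAccumulator:
    """Hypothetical growable buffer of structured records (illustration only)."""

    def __init__(self, dtype, capacity=4):
        # Pre-allocate some room; at least one slot so doubling always grows.
        self._buf = np.zeros(max(1, capacity), dtype=dtype)
        self._n = 0  # number of slots actually filled

    def append(self, record):
        if self._n == len(self._buf):
            # Buffer is full: double it and copy the filled part over.
            bigger = np.zeros(2 * len(self._buf), dtype=self._buf.dtype)
            bigger[:self._n] = self._buf
            self._buf = bigger
        self._buf[self._n] = record  # record is a tuple, one value per field
        self._n += 1

    def __len__(self):
        return self._n

    def to_array(self):
        # Return a copy of only the filled part as a regular ndarray.
        return self._buf[:self._n].copy()


dt2 = np.dtype([('unit', int), ('price', float),
                ('amount', float), ('discount_price', float)])

acc = RecordAccumulator(dt2, capacity=2)
for row in [(1, 2.2, 0.0, 0.0), (3, 4.5, 0.0, 0.0),
            (2, 1.2, 0.0, 0.0), (5, 4.2, 0.0, 0.0)]:
    acc.append(row)

# Convert to a regular array for calculations.
data3 = acc.to_array()
data3['discount_price'] = data3['price'] * 0.9
```

Doubling the buffer means most appends are a single assignment, and the
full-array copy that concatenate() does on every call happens only
occasionally.
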
>
>
> -Chris
>
>
numpy.lib.recfunctions has a function for easily adding new columns.  Of
course, it really returns a new recarray rather than adding the column to the
existing recarray.  Appending records to such an array, however, is a
different story, and you have to do something like what you demonstrated above.
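
A short sketch of the recfunctions route, assuming the function meant here is
append_fields (the message does not name it). The array is the one from the
original question; usemask=False asks for a plain ndarray instead of a masked
array.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

arr = np.array(
    [(1, 2.2, 0.0), (3, 4.5, 0.0)],
    dtype=[('unit', int), ('price', float), ('amount', float)],
)

# Returns a NEW array with the extra field; arr itself is unchanged.
data2 = rfn.append_fields(
    arr, 'discount_price', arr['price'] * 0.9, usemask=False
)
```

Passing asrecarray=True instead returns a recarray, if attribute access such
as data2.discount_price is wanted.
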

Ben Root