[Numpy-discussion] Calculations with mixed type structured arrays

josef.pktd@gmai... josef.pktd@gmai...
Thu Jun 4 10:30:30 CDT 2009


After yesterday's discussion, I wanted to see if views of structured
arrays with mixed types can be used easily.

Is the following useful for the numpy user guide?

Josef


Calculations with mixed type structured arrays
----------------------------------------------

>>> import numpy as np

The following array has two integer and three float columns

>>> dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
...                ('d', '<f8'), ('e', '<f8')])

>>> xs = np.ones(3,dt)
>>> print xs.shape
(3,)
>>> print repr(xs)
array([(1, 1, 1.0, 1.0, 1.0), (1, 1, 1.0, 1.0, 1.0), (1, 1, 1.0, 1.0, 1.0)],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')])

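If we try to view it as float, the memory of the two
integers in each record is lumped together into a single float.
We get numbers that don't represent our data correctly, and we
lose one element per record, as the first example below shows.

If the memory cannot be interpreted under the new dtype, as for
dt0 in the second example, whose 28-byte records are not a
multiple of the 8-byte float size, we get an exception instead::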

>>> print xs.view(float)
[  2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000
   2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000
   2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000]
>>> print xs.view(float).shape
(12,)


>>> dt0 = np.dtype([('a', '<i4'), ('c', '<f8'),
...                ('d', '<f8'), ('e', '<f8')])
>>> np.ones(3,dt0).view(float)
Traceback (most recent call last):
ValueError: new type not compatible with array.
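
The exception can be anticipated from the record size: a dt0 record
has 4 + 3*8 = 28 bytes, which is not a multiple of 8. The sketch below
(Python 3 syntax) checks this before viewing. The helper can_view_as is
hypothetical, not part of numpy, and only compares sizes; it says
nothing about whether the reinterpreted values are meaningful.

```python
import numpy as np

def can_view_as(arr, new_dtype):
    """Return True if arr's record size is a whole multiple of new_dtype's.

    Hypothetical helper: a size check only, intended for viewing a
    structured array as a smaller plain dtype.
    """
    return arr.dtype.itemsize % np.dtype(new_dtype).itemsize == 0

dt0 = np.dtype([('a', '<i4'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')])
x0 = np.ones(3, dt0)

print(dt0.itemsize)            # bytes per record
print(can_view_as(x0, '<f8'))  # size check before attempting the view

try:
    x0.view('<f8')
except ValueError as err:
    # The exact message depends on the numpy version.
    print("view failed:", err)
```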


However, we can construct a new dtype that creates
views on the integer part and the float part separately

>>> dt2 = np.dtype([('A', '<i4',2), ('B', '<f8', 3)])

>>> print repr(xs.view(dt2))
array([([1, 1], [1.0, 1.0, 1.0]), ([1, 1], [1.0, 1.0, 1.0]),
       ([1, 1], [1.0, 1.0, 1.0])],
      dtype=[('A', '<i4', 2), ('B', '<f8', 3)])
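
This view works because dt2 describes the same bytes as dt: both
records are 32 bytes long, and the float block starts at byte 8 in
both. A short sketch (Python 3 syntax) that checks the layouts through
the dtype's fields mapping, which gives (dtype, byte offset) per name:

```python
import numpy as np

dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
               ('d', '<f8'), ('e', '<f8')])
dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])

# Both layouts describe the same number of bytes per record.
print(dt.itemsize, dt2.itemsize)

# The integer blocks start at the same offset, and so do the float blocks.
print(dt.fields['a'][1], dt2.fields['A'][1])
print(dt.fields['c'][1], dt2.fields['B'][1])

# The view reinterprets the existing buffer; nothing is copied.
xs = np.ones(3, dt)
xv = xs.view(dt2)
print(np.shares_memory(xs, xv))
```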

Now we are able to access the two subarrays and perform
calculations with them

>>> print xs.view(dt2)['B'].mean(0)
[ 1.  1.  1.]
>>> print xs.view(dt2)['A'].mean(0)
[ 1.  1.]
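
Since each field of the view is an ordinary 2-d array, other
axis-wise reductions work the same way. A small sketch (Python 3
syntax) with made-up values, so that the results differ per column
and per row:

```python
import numpy as np

dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
               ('d', '<f8'), ('e', '<f8')])
dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])

# Illustrative data, not taken from the example above.
xs = np.array([(1, 2, 1.0, 2.0, 3.0),
               (3, 4, 4.0, 5.0, 6.0)], dtype=dt)
xv = xs.view(dt2)

print(xv['B'].shape)    # one row per record, one column per float field
print(xv['B'].mean(0))  # column means over the records
print(xv['B'].sum(1))   # row sums over the three float fields
print(xv['A'].max(0))   # column maxima of the integer fields
```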

We can also assign new names to the two views and calculate
(almost) as if they were regular arrays.
The new variables are still only views on the original
memory. If we change them, the original structured array
changes as well:

>>> xva = xs.view(dt2)['A']
>>> xvb = xs.view(dt2)['B']

>>> xva *= range(1,3)
>>> xvb[:,:] = xvb*range(1,4)

>>> print xs
[(1, 2, 1.0, 2.0, 3.0) (1, 2, 1.0, 2.0, 3.0) (1, 2, 1.0, 2.0, 3.0)]
>>> print xva.mean(0)
[ 1.  2.]
>>> print xvb.mean(0)
[ 1.  2.  3.]
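
If the original array should stay untouched, one option is to copy
the field view before modifying it. A minimal sketch (Python 3 syntax)
contrasting the two cases; the name xvb_copy is illustrative only:

```python
import numpy as np

dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
               ('d', '<f8'), ('e', '<f8')])
dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])
xs = np.ones(3, dt)

xvb = xs.view(dt2)['B']   # view: shares memory with xs
xvb_copy = xvb.copy()     # copy: independent memory

xvb_copy *= 10            # leaves xs unchanged
print(xs['c'])

xvb *= 2                  # writes through to xs
print(xs['c'])
```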

