[Numpy-discussion] stacking record arrays

Skipper Seabold jsseabold@gmail....
Mon Jun 21 13:51:53 CDT 2010


On Mon, Jun 21, 2010 at 2:44 PM, Benjamin Root <ben.root@ou.edu> wrote:
> Hello,
>
> I ran into a somewhat counter-intuitive situation that probably should be
> documented somewhere with respect to record (structured?) arrays.  I wanted
> to stack multiple arrays together that had the same names for the columns.
> Since I was imagining the columns as having the names (as opposed to rows),
> I figured vstack() would do the job, but it merely created a sequence of
> record arrays.  Turns out that I had to use hstack() to get what I wanted.
> Here is an example of what I am running into:
>
> >>> tracks[0]
> array([('M', 87.087318420410156, 223.30305480957031, 1),
>        ('M', 76.440017700195312, 227.68635559082031, 2),
>        ('M', 66.116767883300781, 233.32769775390625, 3),
>        ('M', 53.614784240722656, 239.75303649902344, 4),
>        ('M', 40.807880401611328, 245.34136962890625, 5),
>        ('M', 28.479558944702148, 250.30470275878906, 6),
>        ('S', 0.0, 0.0, -9)],
>       dtype=[('types', '|S1'), ('xLocs', '<f4'), ('yLocs', '<f4'),
> ('frameNums', '<i4')])
>
> >>> tracks[1]
> array([('M', 184.38957214355469, 114.406494140625, 1),
>        ('M', 197.28269958496094, 133.10012817382812, 2),
>        ('M', 209.65650939941406, 151.74812316894531, 3),
>        ('M', 223.28224182128906, 171.2159423828125, 4),
>        ('M', 238.75798034667969, 190.44374084472656, 5),
>        ('M', 254.47175598144531, 211.06010437011719, 6),
>        ('S', 0.0, 0.0, -9)],
>       dtype=[('types', '|S1'), ('xLocs', '<f4'), ('yLocs', '<f4'),
> ('frameNums', '<i4')])
>
>
> >>> np.vstack((tracks[0], tracks[1]))
> array([[('M', 87.087318420410156, 223.30305480957031, 1),
>         ('M', 76.440017700195312, 227.68635559082031, 2),
>         ('M', 66.116767883300781, 233.32769775390625, 3),
>         ('M', 53.614784240722656, 239.75303649902344, 4),
>         ('M', 40.807880401611328, 245.34136962890625, 5),
>         ('M', 28.479558944702148, 250.30470275878906, 6),
>         ('S', 0.0, 0.0, -9)],
>        [('M', 184.38957214355469, 114.406494140625, 1),
>         ('M', 197.28269958496094, 133.10012817382812, 2),
>         ('M', 209.65650939941406, 151.74812316894531, 3),
>         ('M', 223.28224182128906, 171.2159423828125, 4),
>         ('M', 238.75798034667969, 190.44374084472656, 5),
>         ('M', 254.47175598144531, 211.06010437011719, 6),
>         ('S', 0.0, 0.0, -9)]],
>       dtype=[('types', '|S1'), ('xLocs', '<f4'), ('yLocs', '<f4'),
> ('frameNums', '<i4')])
>
> >>> np.hstack((tracks[0], tracks[1]))
> array([('M', 87.087318420410156, 223.30305480957031, 1),
>         ('M', 76.440017700195312, 227.68635559082031, 2),
>         ('M', 66.116767883300781, 233.32769775390625, 3),
>         ('M', 53.614784240722656, 239.75303649902344, 4),
>         ('M', 40.807880401611328, 245.34136962890625, 5),
>         ('M', 28.479558944702148, 250.30470275878906, 6),
>         ('S', 0.0, 0.0, -9), ('M', 184.38957214355469, 114.406494140625, 1),
>         ('M', 197.28269958496094, 133.10012817382812, 2),
>         ('M', 209.65650939941406, 151.74812316894531, 3),
>         ('M', 223.28224182128906, 171.2159423828125, 4),
>         ('M', 238.75798034667969, 190.44374084472656, 5),
>         ('M', 254.47175598144531, 211.06010437011719, 6),
>         ('S', 0.0, 0.0, -9)],
>       dtype=[('types', '|S1'), ('xLocs', '<f4'), ('yLocs', '<f4'),
> ('frameNums', '<i4')])
>
> By the way, both functions return record arrays, as expected.  However, the
> result is a 2-d array (shape (2, 7)) for vstack and 1-d (shape (14,)) for hstack.
> Maybe I have been seeing this wrong, but I hope this helps out anyone else
> who might have been confused.
>

For a little more control, you might try

import numpy.lib.recfunctions as nprf
# arr1 and arr2 are structured arrays, e.g. tracks[0] and tracks[1]
nprf.stack_arrays((arr1, arr2), usemask=False)
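
As a rough sketch of how the three calls compare, here is a small self-contained example. The two arrays are made-up stand-ins that only reuse the field names from the quoted message, so the values are assumptions and not the original tracks data.

```python
import numpy as np
import numpy.lib.recfunctions as nprf

# Hypothetical stand-ins for tracks[0] and tracks[1]; the values are made up.
dt = [('types', 'S1'), ('xLocs', '<f4'), ('yLocs', '<f4'), ('frameNums', '<i4')]
arr1 = np.array([(b'M', 87.0, 223.0, 1), (b'S', 0.0, 0.0, -9)], dtype=dt)
arr2 = np.array([(b'M', 184.0, 114.0, 1), (b'S', 0.0, 0.0, -9)], dtype=dt)

v = np.vstack((arr1, arr2))   # adds a new leading axis: one row per input array
h = np.hstack((arr1, arr2))   # joins the records end to end in one 1-d array
s = nprf.stack_arrays((arr1, arr2), usemask=False)  # 1-d as well, plain ndarray

print(v.shape, h.shape, s.shape)
print(s['frameNums'])
```

With identical dtypes, hstack and stack_arrays should give the same 1-d result; the two differ once the field lists of the inputs stop matching.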

If you work with structured arrays a lot, the functions in nprf
become pretty handy.  The module is still not imported into the numpy
namespace by default, but hopefully there will be time for a review and
inclusion at some point, because it's pretty useful.
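
One case where the extra control shows up is stacking arrays whose fields do not all match. The sketch below uses two made-up arrays (`a` and `b` are hypothetical names, not from the original message) and leaves usemask at its default, so the missing entries come back masked.

```python
import numpy as np
import numpy.lib.recfunctions as nprf

# Hypothetical arrays: `b` has an extra field 'frameNums' that `a` lacks.
a = np.array([(1.0, 2.0)], dtype=[('xLocs', '<f4'), ('yLocs', '<f4')])
b = np.array([(3.0, 4.0, 7)],
             dtype=[('xLocs', '<f4'), ('yLocs', '<f4'), ('frameNums', '<i4')])

# With the default usemask=True the result is a masked array, and the
# entries that `a` could not supply are masked rather than invented.
stacked = nprf.stack_arrays((a, b))

print(stacked.dtype.names)
print(stacked['frameNums'].mask)
```

np.hstack would refuse these inputs because their dtypes differ, whereas stack_arrays builds the union of the fields and marks the gaps.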

Skipper

