[Numpy-discussion] ragged array implementation

Francesc Alted faltet@pytables....
Mon Mar 7 13:18:00 CST 2011

On Monday 07 March 2011 19:42:00 Christopher Barker wrote:
> But now that you've entered the conversation, does HDF and/or
> pytables have a standard way of dealing with this?

Well, I don't think there is a 'standard' way of dealing with 
ragged arrays, but yes, PyTables has support for them.  Creating them 
looks like this:

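import tables
from numpy import array

# Create a VLArray (an integer atom is assumed here, to match the title):
fileh = tables.openFile('vlarray1.h5', mode='w')
vlarray = fileh.createVLArray(fileh.root, 'vlarray1',
                              tables.Int32Atom(shape=()),
                              "ragged array of ints")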
# Append some (variable length) rows:
vlarray.append(array([5, 6]))
vlarray.append(array([5, 6, 7]))
vlarray.append([5, 6, 9, 8])

Then, you can access the rows in a variety of ways, like iterators:

print '-->', vlarray.title
for x in vlarray:
    print '%s[%d]--> %s' % (vlarray.name, vlarray.nrow, x)

--> ragged array of ints
vlarray1[0]--> [5 6]
vlarray1[1]--> [5 6 7]
vlarray1[2]--> [5 6 9 8]

or via __getitem__, using general fancy indexing:

a_row = vlarray[2]
a_list = vlarray[::2]
a_list2 = vlarray[[0,2]]   # get list of coords
a_list3 = vlarray[[0,-2]]  # negative values accepted
a_list4 = vlarray[numpy.array([True,...,False])]  # array of bools

but, instead of returning a numpy array of 'object' elements, a 
multi-row selection returns a plain Python list of rows.  More info on 
the VLArray object is in the PyTables User's Guide.
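
To make the indexing behaviour above concrete without needing an HDF5 file, here is a minimal pure-Python sketch. It is only an illustration: the class name RaggedRows and its flat values-plus-offsets layout are assumptions made for this example, not how PyTables stores or reads a VLArray. It mimics the behaviour described above: an integer key gives one row, while a slice, a list of coordinates, or a list of booleans gives a plain Python list of rows.

```python
class RaggedRows:
    """Hypothetical in-memory stand-in for a ragged array (not PyTables code).

    All elements live in one flat list; offsets[i]:offsets[i + 1] bounds row i.
    """

    def __init__(self):
        self.values = []     # every element of every row, concatenated
        self.offsets = [0]   # row i spans values[offsets[i]:offsets[i + 1]]

    def append(self, row):
        """Add one variable-length row at the end."""
        self.values.extend(row)
        self.offsets.append(len(self.values))

    def __len__(self):
        return len(self.offsets) - 1

    def _row(self, i):
        n = len(self)
        if i < 0:
            i += n           # negative indices count from the end
        if not 0 <= i < n:
            raise IndexError("row index out of range")
        return self.values[self.offsets[i]:self.offsets[i + 1]]

    def __getitem__(self, key):
        if isinstance(key, slice):
            return [self._row(i) for i in range(*key.indices(len(self)))]
        if isinstance(key, (list, tuple)):
            # A sequence made only of booleans is treated as a row mask.
            if key and all(isinstance(k, bool) for k in key):
                if len(key) != len(self):
                    raise IndexError("boolean mask must have one entry per row")
                return [self._row(i) for i, keep in enumerate(key) if keep]
            # Otherwise it is a list of row coordinates.
            return [self._row(i) for i in key]
        return self._row(key)    # single integer: one row

    def __iter__(self):
        return (self._row(i) for i in range(len(self)))


ragged = RaggedRows()
ragged.append([5, 6])
ragged.append([5, 6, 7])
ragged.append([5, 6, 9, 8])

print(ragged[2])                    # a single row
print(ragged[::2])                  # a list of rows
print(ragged[[0, -2]])              # coordinates, negative values accepted
print(ragged[[True, False, True]])  # boolean mask, one entry per row
```

The point of the sketch is the return types: because the rows have different lengths, a multi-row selection cannot be a regular 2-D array, so a list of rows is the natural container.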

> is a "vlen array" stored contiguously in netcdf?

I don't really know, but one limitation of variable length arrays in 
HDF5 (and hence NetCDF4) is that they cannot be compressed (but that 
should be addressed in the future).

Francesc Alted
