[Numpy-discussion] Designing a new storage format for numpy recarrays
Zachary Pincus
zachary.pincus@yale....
Fri Oct 30 09:26:21 CDT 2009
Unless I read your request or the documentation wrong, h5py already
supports pulling specific fields out of "compound data types":
http://h5py.alfven.org/docs-1.1/guide/hl.html#id3
> For compound data, you can specify multiple field names alongside
> the numeric slices:
> >>> dset["FieldA"]
> >>> dset[0,:,4:5, "FieldA", "FieldB"]
> >>> dset[0, ..., "FieldC"]
Is this latter style of access what you were asking for? (Or is the
problem that it's not fast enough in hdf5, even with the shuffle
filter, etc.?)
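
For concreteness, here is a minimal sketch of that field-wise access.
The dataset name "table", the dtype, and the values are made up for
illustration; only the field names come from the quoted docs. It
assumes a reasonably recent h5py, whose behavior may differ in detail
from the 1.1 documentation linked above, and it uses the in-memory
"core" driver so nothing is written to disk.

```python
import numpy as np
import h5py

# Hypothetical compound dtype; field names mirror the quoted h5py docs.
dt = np.dtype([("FieldA", "f8"), ("FieldB", "i4"), ("FieldC", "f4")])
data = np.zeros((3, 4), dtype=dt)
data["FieldA"] = np.arange(12, dtype="f8").reshape(3, 4)
data["FieldB"] = 7

# In-memory HDF5 file: core driver with no backing store on disk.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    # LZF compression plus the shuffle filter, as mentioned in the thread.
    dset = f.create_dataset("table", data=data,
                            compression="lzf", shuffle=True)

    # One field of the whole dataset, returned as a plain ndarray.
    field_a = dset["FieldA"]

    # A numeric slice combined with a single field name.
    row0_a = dset[0, :, "FieldA"]

    # A numeric slice combined with two field names; the result is a
    # structured array holding just those two fields.
    pair = dset[0, :, "FieldA", "FieldB"]
```

Each read names the fields it wants, so the caller never has to load
the full records and index them afterwards.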
So then the issue is the dependency on hdf5 and h5py? (Or, if you
want to read LZF-compressed files without h5py, a dependency on hdf5
and the C LZF compressor?) That is a pretty lightweight set of
dependencies, especially given that you're proposing to write new
code which would itself be a dependency. So your new code couldn't
depend on *anything* else if you wanted it to be a fewer-dependencies
option than hdf5+h5py, right?
Zach