[Numpy-discussion] checksum on numpy float array
Mon Dec 8 12:01:36 CST 2008
On Sunday 07 December 2008, Brennan Williams wrote:
> OK so maybe I should....
> (1) not add some sort of checksum type functionality to my read/write
> these read/write methods simply read/write numpy arrays to a
> binary file which contains one or more numpy arrays (and nothing
> (2) replace my binary files with either HDF5 or PyTables
> my app is being used by clients on existing projects - in one case
> there are over 900 of these numpy binary files in just one project,
> albeit each file is pretty small (200KB or so)
> so.. questions.....
> How can I transparently (or at least with minimum user-pain) replace
> my existing read/write methods with PyTables or HDF5?
> My initial thoughts are...
> (a) have an app version number and a data format version number which
> I can check against.
> (b) if data format version < 1.0 then read from old binary files
> (c) if app version number > 1.0 then write to new PyTables or HDF5
> (d) get clients to open existing project and then save existing
> project to semi-transparently convert from old to new formats.
Yeah. That would work perfectly. Also, there is a function in PyTables
named 'isHDF5File(filename)' that allows you to know whether a file is
in HDF5 format or not. You might want to use it and avoid bothering
with data format/app version issues.
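
A rough sketch of that idea, using only the standard library so it runs
without PyTables installed. It is not how 'isHDF5File' is implemented; it
only looks for the 8-byte HDF5 signature at offset 0, 512, 1024, and so
on, which is where the file format allows it to sit. The names
'looks_like_hdf5' and 'choose_reader' are made up for illustration, and
the "legacy" branch stands in for the existing binary read method.

```python
import os

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the HDF5 signature is found at offset 0, 512, 1024, ...

    Hypothetical stand-in for a library check such as PyTables'
    isHDF5File(); it inspects the signature only, not the rest of the file.
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # A user block may precede the superblock, so the signature
            # can also appear at 512 bytes and at successive doublings.
            offset = 512 if offset == 0 else offset * 2
    return False


def choose_reader(path):
    """Pick a reader by file content instead of a stored version number."""
    return "hdf5" if looks_like_hdf5(path) else "legacy"
```

With something like this, the read method can dispatch on
choose_reader(path) for each file, and the write method can always emit
the new format, so old projects get converted the first time they are
saved.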
> Francesc Alted wrote:
> > On Friday 05 December 2008, Andrew Collette wrote:
> >>> Another possibility would be to use HDF5 as a data container. It
> >>> supports the fletcher32 filter, which basically computes a
> >>> checksum for every data chunk written to disk and then always checks
> >>> that the data read satisfies the checksum kept on-disk. So, if
> >>> the HDF5 layer doesn't complain, you are basically safe.
> >>> There are at least two usable HDF5 interfaces for Python and
> >>> NumPy: PyTables and h5py. PyTables does have support for
> >>> that right out-of-the-box. Not sure about h5py though (a quick
> >>> search in docs doesn't reveal anything).
> >>>  http://rfc.sunsite.dk/rfc/rfc1071.html
> >>>  http://www.pytables.org
> >>>  http://h5py.alfven.org
> >>> Hope it helps,
> >> Just to confirm that h5py does in fact have fletcher32; it's one
> >> of the options you can specify when creating a dataset, although
> >> it could use better documentation:
> >> http://h5py.alfven.org/docs/guide/hl.html#h5py.highlevel.Group.create_dataset
> > My bad. I searched for 'fletcher' instead of 'fletcher32'. I
> > naively thought that the search tool in Sphinx allowed for partial
> > name finding. In fact, it is a pity it does not.
> > Cheers,
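
To make the per-chunk checksum idea from the quoted thread concrete, here
is a small standard-library sketch. It uses a textbook Fletcher-32 over
little-endian 16-bit words and a made-up chunk layout (length, checksum,
payload); it is not claimed to match the on-disk format or the exact
checksum variant of the HDF5 fletcher32 filter. The helper names
'write_chunk' and 'read_chunk' are hypothetical. In practice, the
'fletcher32' option of h5py's create_dataset or the equivalent PyTables
filter does this for you.

```python
import struct


def fletcher32(data):
    """Textbook Fletcher-32 over little-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = data[i] | (data[i + 1] << 8)
        sum1 = (sum1 + word) % 65535
        sum2 = (sum2 + sum1) % 65535
    return (sum2 << 16) | sum1


def write_chunk(f, payload):
    """Write one chunk as: 4-byte length, 4-byte checksum, payload."""
    f.write(struct.pack("<II", len(payload), fletcher32(payload)))
    f.write(payload)


def read_chunk(f):
    """Read one chunk and refuse to return it if the checksum differs."""
    header = f.read(8)
    if len(header) < 8:
        raise EOFError("no complete chunk header left")
    length, expected = struct.unpack("<II", header)
    payload = f.read(length)
    if len(payload) != length or fletcher32(payload) != expected:
        raise IOError("checksum mismatch: chunk is corrupted")
    return payload
```

For a NumPy array, the payload would simply be the array's raw bytes
(for example from arr.tobytes()), written chunk by chunk; any chunk
altered on disk then raises on read instead of silently returning bad
floats.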