[SciPy-user] What is fastest load/save matrix methods?

Dave Kuhlman dkuhlman at cutter.rexx.com
Mon Dec 19 22:27:05 CST 2005


On Tue, Dec 20, 2005 at 10:49:14AM +0900, Cournapeau David wrote:
> On Mon, 2005-12-19 at 11:01 +0100, Arnd Baecker wrote:
> > On Mon, 19 Dec 2005, Hugo Gamboa wrote:
> > 
> > > Hi there,
> > >
> > > I need to work with large matrixes that come in ascii format.
> > >

[snip]

> 
> pytables uses hdf5 file format as a storage model, with a high level
> abstraction when you want to save a lot of data in one file.
> 
> http://pytables.sourceforge.net/html/WelcomePage.html

Good suggestion.

Am I right that PyTables does not *directly* support scipy arrays?
That's not a big problem.  You can write a scipy array to an HDF5
file using PyTables with something like the following::

    import numarray
    import scipy
    import tables

    filename = 'testpytables1.h5'
    dataset1 = [[1,2],[3,4],[5,6]]
    # Create a new HDF5 file and a group to hold the data sets.
    h5file = tables.openFile(filename, mode = "w",
        title = "PyTables test file")
    datasets = h5file.createGroup(h5file.root, "datasets", "Test data sets")
    array1 = scipy.array(dataset1)
    # Write array after converting to a numarray array.
    h5file.createArray(datasets, 'dataset1',
        numarray.array(array1.tolist()),
        "Test data set #1")
    h5file.close()

And, reading it back in::

    h5file = tables.openFile(filename, 'r')
    dataset1Obj = h5file.getNode('/datasets', 'dataset1')
    dataset1Array = scipy.array(dataset1Obj.read())
    h5file.close()

Is there a more efficient way?

Has anyone tried to modify PyTables so that it supports scipy
arrays directly?
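
For comparison, here is a minimal sketch of the same round trip with the
array passed to PyTables directly. It is not the API used above. It assumes
PyTables 3.x (where the methods are spelled open_file, create_group,
create_array and get_node) together with NumPy in place of scipy/numarray.
The helper names save_array and load_array are hypothetical and exist only
for this illustration.

```python
import os
import tempfile

import numpy as np
import tables  # assumption: PyTables 3.x, which uses snake_case method names


def save_array(filename, data, name="dataset1"):
    """Write a NumPy array to /datasets/<name> in a new HDF5 file."""
    with tables.open_file(filename, mode="w", title="PyTables test file") as h5file:
        group = h5file.create_group(h5file.root, "datasets", "Test data sets")
        # The array is handed over as-is; no tolist() conversion step.
        h5file.create_array(group, name, data, "Test data set #1")


def load_array(filename, name="dataset1"):
    """Read /datasets/<name> back; read() returns a NumPy array."""
    with tables.open_file(filename, mode="r") as h5file:
        node = h5file.get_node("/datasets", name)
        return node.read()


# Round trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "testpytables1.h5")
original = np.array([[1, 2], [3, 4], [5, 6]])
save_array(path, original)
restored = load_array(path)
```

Skipping the tolist() step avoids building a nested Python list of every
element, which is probably where most of the overhead would be for a large
matrix.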

Dave

-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman


