[SciPy-user] What is fastest load/save matrix methods?

Andrew Straw strawman at astraw.com
Tue Dec 20 00:46:20 CST 2005


Cournapeau David wrote:

>On Mon, 2005-12-19 at 20:27 -0800, Dave Kuhlman wrote:
>  
>
>>On Tue, Dec 20, 2005 at 10:49:14AM +0900, Cournapeau David wrote:
>>    
>>
>>>On Mon, 2005-12-19 at 11:01 +0100, Arnd Baecker wrote:
>>>      
>>>
>>>>On Mon, 19 Dec 2005, Hugo Gamboa wrote:
>>>>
>>>>        
>>>>
>>>>>Hi there,
>>>>>
>>>>>I need to work with large matrices that come in ASCII format.
>>>>>
>>>>>          
>>>>>
>>[snip]
>>
>>    
>>
>>>pytables uses hdf5 file format as a storage model, with a high level
>>>abstraction when you want to save a lot of data in one file.
>>>
>>>http://pytables.sourceforge.net/html/WelcomePage.html
>>>      
>>>
>>Good suggestion.
>>
>>Am I right that PyTables does not *directly* support scipy arrays?
>>That's not a big problem.  You can write a scipy array to an HDF5
>>file using PyTables with something like the following::
>>
>>    filename = 'testpytables1.h5'
>>    dataset1 = [[1,2],[3,4],[5,6]]
>>    h5file = tables.openFile(filename, mode = "w",
>>        title = "PyTables test file")
>>    datasets = h5file.createGroup(h5file.root, "datasets", "Test data sets")
>>    array1 = scipy.array(dataset1)
>>    # Write array after converting to a numarray array.
>>    h5file.createArray(datasets, 'dataset1',
>>        numarray.array(array1.tolist()),
>>        "Test data set #1")
>>    h5file.close()
>>
>>And, reading it back in:
>>
>>    h5file = tables.openFile(filename, 'r')
>>    dataset1Obj = h5file.getNode('/datasets', 'dataset1')
>>    dataset1Array = scipy.array(dataset1Obj.read())
>>    h5file.close()
>>
>>Is there a more efficient way?
>>    
>>
>
>Is there no way to convert numarray to scipy more directly than
>going through a list? Concerning implementing scipy array support, what
>is the difference between a scipy array and a numarray array?
>  
>

With sufficiently recent versions of scipy, Numeric, and numarray,
y = numstar.asarray(x) (with "numstar" standing for whichever of the
three packages you are converting to) will create a view of the array x
with no copying, which is pretty fast.

"Sufficiently recent" means released within the last 2 months or so.
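
For readers without those 2005-era packages, here is a minimal sketch of the
same idea using present-day NumPy, which is an assumption on my part and not
something used in the thread. ForeignArray is a hypothetical stand-in for an
array owned by another package; it exposes its memory through the
__array_interface__ attribute, and numpy.asarray() wraps that memory instead of
copying it. The list round trip from the quoted example is shown for contrast.

```python
import numpy as np


class ForeignArray:
    """Hypothetical stand-in for an array object from another package.

    It owns some storage and advertises it through __array_interface__,
    so that NumPy can wrap the same memory without copying it.
    """

    def __init__(self, values):
        # Backing storage; a real foreign package would own its own buffer.
        self._store = np.array(values, dtype=np.float64)

    @property
    def __array_interface__(self):
        # Reuse the description (shape, typestr, data pointer) of the storage.
        return self._store.__array_interface__


x = ForeignArray([[1, 2], [3, 4], [5, 6]])

# No-copy conversion: `view` points at the memory owned by `x`.
view = np.asarray(x)

# Round trip through a nested Python list: always allocates new memory.
copy = np.array(view.tolist())

# Writing through the view changes the original storage, not the copy.
view[0, 0] = 99.0

print(np.shares_memory(view, x._store))  # view shares x's memory
print(np.shares_memory(copy, x._store))  # copy does not
```

The view costs the same regardless of array size, while the list round trip
creates one Python object per element, so the gap grows with the matrix.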


