[SciPy-user] Fastest way to read a matrix in

Travis E. Oliphant oliphant@enthought....
Tue Feb 5 17:44:16 CST 2008


Jose Luis Gomez Dans wrote:
> Hi!
>
>   
>> On Feb 5, 2008 9:15 AM, Jose Luis Gomez Dans <josegomez@gmx.net> wrote:
>> MxN array (or is it NxM? :D) I am using scipy.io.read_array(), but the
>> performance is fairly slow (these are 80x80ish arrays). While they are on NFS
>> mounts, other programs read the data in faster than python's
>> scipy.io.read_array, so I was wondering whether there's a faster way of reading the data in
>> (maybe giving hints on the number of columns and rows, using some other
>> function, etc)?
>>
>> Try numpy.fromfile()
>>     
>
> Aaaahhhh.... This was an improvement. It appears that numpy also has loadtxt(). A few simple examples show that read_array takes between 0.5 and 0.6 of wall time, with loadtxt taking 0.04 and fromfile() (your suggestion) 0.01 (same file, already in cache, tests repeated 10 times). That's three methods that look as if they do the same sort of thing, with three very different performances.
>   

Yes, we are trying to fix this.  In fact, read_array will be deprecated
in SciPy 0.7, and loadtxt will be promoted in NumPy.

fromfile will always exist as a low-level routine (without any bells
and whistles) that can handle very uniform file layouts, but it will not
be advertised in a tutorial.
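
As a minimal sketch of the difference between the two routines (the file
name, the 80x80 shape and the whitespace-separated layout are assumptions
made for illustration, not details taken from the original files), loadtxt
works out the 2-D shape on its own, while fromfile in text mode returns a
flat array that has to be reshaped by hand:

```python
import os
import tempfile

import numpy as np

# Hypothetical 80x80 whitespace-separated text matrix, written to a temp file.
rows, cols = 80, 80
original = np.arange(rows * cols, dtype=float).reshape(rows, cols)
path = os.path.join(tempfile.mkdtemp(), "matrix.txt")
np.savetxt(path, original)

# loadtxt parses the file line by line and infers the 2-D shape itself.
a = np.loadtxt(path)

# fromfile in text mode (sep=" ") returns a flat 1-D array, so the
# caller has to supply the shape. It assumes a strictly uniform layout.
b = np.fromfile(path, dtype=float, sep=" ").reshape(rows, cols)

print(a.shape, b.shape, np.allclose(a, b))
```

The reshape only works if the number of rows and columns is known in
advance, which is part of why fromfile is best kept for very regular files.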

scipy.io.read_array suffers from feature creep which slows down simple 
operations.

-Travis


