[SciPy-user] Reading in data as arrays, quickly and easily?

Perry Greenfield perry at stsci.edu
Mon Jul 12 15:36:31 CDT 2004


> Eric Jonas wrote:
> Well, I had been focusing on numarray, because everything I read seems
> to suggest that it's the wave of the future, although at the same time
> no one really seems to be using it much yet. May I ask how much larger
> than 1 GB?  I'm dealing with between 1-20 GB EEG files, and for some
> reason I don't think I'll be able to afford 64-bit hardware in the near
> future : ) 
> 
Even on a 64-bit system, Numeric and numarray currently limit the size
of an array to what fits in 32 bits, because of the integer size Python
uses for indexing sequences (fortunately, work to change that appears
to be underway, or soon will be). So you won't be able to map a whole
20 GB array, though it is possible to map sections of a file.
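
As a rough illustration of mapping only a section of a large file,
here is a minimal sketch that uses Python's standard-library mmap
module rather than numarray's own memory-mapping interface. The
helper name map_window and the demo file of little-endian int32
samples are assumptions made up for this example.

```python
import mmap
import os
import struct
import tempfile


def map_window(path, start, length):
    """Return bytes [start, start + length) of a file via a partial map.

    Only the requested window (plus alignment slack) is mapped, so the
    file itself may be far larger than the address space allows.
    The window is assumed to lie entirely inside the file.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    # mmap offsets must be a multiple of the allocation granularity,
    # so round the start down and skip the slack afterwards.
    aligned = (start // gran) * gran
    delta = start - aligned
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=aligned) as m:
            return m[delta:delta + length]


# Demo: a small stand-in file holding 100000 little-endian int32 values.
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
    tmp.write(struct.pack("<100000i", *range(100000)))
    tmp.close()
    # Map only the 16 bytes that hold samples 70000..70003.
    window = map_window(tmp.name, 70000 * 4, 16)
    values = struct.unpack("<4i", window)
finally:
    os.remove(tmp.name)
```

The same approach works for a multi-gigabyte file: each call maps
one window, and the window is released again when the map is closed.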

> What I really want is to read in some fairly complex records, do endian
> swapping, alignment, etc. all in C. I'm mostly interested in spectral
> analysis, so the hope was that I'd be able to read in 32kB chunks at a
> time for my periodograms. 
>  
Depending on how complex the records are, recarray may be useful, as
may PyTables. The best approach depends on the details of the data,
so I can't really say which it is without knowing them. I'd say
that so long as your data are regularly spaced in the file in some
manner, there is a reasonably efficient way to read the data into
numarray. If the array data are contiguous, then the options that
Travis mentioned for Numeric should work well also (and I'm sure
there are some tricks that can be played with more complex
representations of non-contiguous data, if you are willing to use
extra memory and rearrange byte arrays).
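
To make the idea of reading regularly spaced records in chunks
concrete, here is a minimal sketch using only the standard-library
struct module, not recarray or PyTables. The record layout (a
big-endian uint32 timestamp followed by four int16 samples) and the
name iter_records are hypothetical, chosen only for illustration;
the real EEG format is not described in this thread.

```python
import io
import struct

# Hypothetical layout: '>' means big-endian with no padding,
# 'I' is a uint32 timestamp, '4h' is four int16 channel samples.
RECORD = struct.Struct(">I4h")


def iter_records(stream, records_per_chunk=1024):
    """Yield decoded records, reading a fixed number of records per chunk.

    Assumes a buffered binary stream, so a short read only happens at
    the end of the file.
    """
    chunk_bytes = RECORD.size * records_per_chunk
    while True:
        buf = stream.read(chunk_bytes)
        if not buf:
            break
        # Ignore a trailing partial record at the end of the file.
        usable = len(buf) - len(buf) % RECORD.size
        # The byte swapping is done by struct while unpacking.
        yield from RECORD.iter_unpack(buf[:usable])


# Demo: three records held in memory instead of a real file.
data = b"".join([
    RECORD.pack(1, 10, -20, 30, -40),
    RECORD.pack(2, 11, -21, 31, -41),
    RECORD.pack(3, 12, -22, 32, -42),
])
records = list(iter_records(io.BytesIO(data), records_per_chunk=2))
```

Each chunk is decoded independently, so memory use is bounded by
records_per_chunk no matter how large the file is.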

Perry


