[Numpy-discussion] Huge arrays
Fri Sep 11 02:07:13 CDT 2009
On Tue, Sep 8, 2009 at 6:41 PM, Charles R Harris wrote:
> More precisely, 2GB for windows and 3GB for (non-PAE enabled) linux.
And just to further clarify: even with PAE enabled on linux, any
individual process still has about a 3 GB address limit (there are
hacks to raise that to 3.5 or 4 GB, but with a performance penalty).
4 GB is the absolute maximum addressable by a single 32-bit process,
even though the kernel itself can use up to 64 GB of physical RAM
with PAE.
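A quick way to tell whether the Python you are running is itself 32-bit
(and therefore subject to these limits) is to look at the pointer width of
the build. This is just a sketch; `struct.calcsize("P")` gives the size of
a C pointer in the running interpreter:

```python
import struct
import sys

# Pointer width of this Python build: 32 on a 32-bit interpreter, 64 on 64-bit.
bits = struct.calcsize("P") * 8
print("%d-bit Python; sys.maxsize = %d" % (bits, sys.maxsize))
```

On a 32-bit build you will see 32 here no matter how much physical RAM the
machine has.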
For gory details on Windows address space limits:
If running 64bit is not an option, I'd consider the "compress in RAM"
technique. Delta-compression for most sampled signals should be quite
doable. Heck, here's some untested pseudo-code:
import zlib
import numpy

data_row = numpy.zeros(2000000, dtype=numpy.int16)
# Fill up data_row ...

# Quick-n-dirty delta encoding: keep the first sample, then store the
# successive differences.  numpy.diff builds a new array before the
# assignment happens, so the overlapping slice is safe.
data_row[1:] = numpy.diff(data_row)

compressed_row_strings = []
compressed_row_strings.append(zlib.compress(data_row.tobytes()))

# Put a loop in there, reuse the row array, and you are almost all set.
# The delta encoding is optional, but probably useful for most "real world"
# 1d signals.  If you don't have the time between samples to compress the
# whole row, break it into smaller chunks (see zlib.compressobj()).
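The chunked variant hinted at in that last comment could look something
like the sketch below. The function names `compress_row_in_chunks` and
`decompress_row` and the `chunk_size` value are my own inventions, not
anything from numpy; the point is just that `zlib.compressobj()` lets you
feed the data in piecewise so the compression cost is spread out:

```python
import zlib
import numpy

def compress_row_in_chunks(row, chunk_size=100000):
    """Compress a 1-D array incrementally, chunk_size bytes at a time."""
    comp = zlib.compressobj()
    raw = row.tobytes()
    pieces = []
    for start in range(0, len(raw), chunk_size):
        pieces.append(comp.compress(raw[start:start + chunk_size]))
    pieces.append(comp.flush())  # emit whatever is still buffered
    return b"".join(pieces)

def decompress_row(blob, dtype=numpy.int16):
    """Decompress and undo the delta encoding via a cumulative sum."""
    deltas = numpy.frombuffer(zlib.decompress(blob), dtype=dtype)
    # cumsum promotes to a wider integer; casting back to int16 wraps,
    # which exactly reverses any wraparound in the delta step.
    return numpy.cumsum(deltas).astype(dtype)
```

Round-tripping a delta-encoded row through these two functions should give
back the original samples bit-for-bit.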