[Numpy-discussion] [Fwd: compression in storage of Numeric/numarray objects]

Perry Greenfield perry at stsci.edu
Fri Sep 9 13:53:07 CDT 2005


On Sep 9, 2005, at 4:41 PM, Joost van Evert wrote:

> On Fri, 2005-09-09 at 15:06 -0500, John Hunter wrote:
>>>>>>> "Joost" == Joost van Evert <phjoost at gmail.com> writes:
>>
>>     Joost> is it possible to use compression while storing
>>     Joost> numarray/Numeric objects?
>>
>>
>> Sure
>>
>>     In [35]: s = rand(10000)
>>
>>     In [36]: file('uncompressed.dat', 'wb').write(s.tostring())
>>
>>     In [37]: ls -l uncompressed.dat
>>     -rw-r--r--  1 jdhunter jdhunter 80000 2005-09-09 15:04 uncompressed.dat
>>
>>     In [38]: gzip.open('compressed.dat', 'wb').write(s.tostring())
>>
>>     In [39]: ls -l compressed.dat
>>     -rw-r--r--  1 jdhunter jdhunter 41393 2005-09-09 15:04 compressed.dat
>>
> Thanks, this helps, but I think not enough, because the arrays I work
> on are sometimes >1 GB (correlation matrices). The tostring method would
> explode the size and result in a lot of swapping. Ideally the
> compression would also work with memory-mapped arrays.
>
Well, it seems to me that you are asking for quite a lot if you expect 
it to work with memory-mapped arrays that are compressed (I'm assuming 
you mean that individual values are decompressed on the fly as they are 
needed). This is something that we gave some thought to a few years 
ago, but it seemed that supporting such capabilities was far too 
complicated, at least for now. Besides, some operations are bound to 
blow up (e.g., take on a compressed array).

But I'm still not sure what you are trying to do and what you would 
like to see happen underneath. An example would do a lot to explain 
what your needs are.
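
In the meantime, if the immediate problem is just the temporary copy 
that tostring() makes of the whole array, one possible workaround is to 
compress a block of rows at a time. This is only a rough sketch under 
some assumptions: it is written against NumPy (Numeric and numarray 
spell some of these calls differently), the helper names 
save_compressed and load_compressed are made up for illustration, and 
the shape and dtype have to be remembered separately because the file 
holds only raw bytes. It does not give compressed memory mapping; 
reading still decompresses the data into an ordinary array.

```python
import gzip
import numpy as np

def save_compressed(arr, path, chunk_rows=1024):
    """Write arr to a gzip file one block of rows at a time.

    Only chunk_rows rows are converted to bytes at once, so the
    temporary copy stays small even if arr is a large memory map.
    """
    with gzip.open(path, 'wb') as f:
        for start in range(0, arr.shape[0], chunk_rows):
            f.write(arr[start:start + chunk_rows].tobytes())

def load_compressed(path, shape, dtype, chunk_rows=1024):
    """Read the raw bytes back into a preallocated array, block by block."""
    out = np.empty(shape, dtype=dtype)
    # Number of bytes in one "row" (one index along the first axis).
    row_bytes = out.itemsize * int(np.prod(shape[1:]))
    with gzip.open(path, 'rb') as f:
        for start in range(0, shape[0], chunk_rows):
            stop = min(start + chunk_rows, shape[0])
            buf = f.read((stop - start) * row_bytes)
            block = np.frombuffer(buf, dtype=dtype)
            out[start:stop] = block.reshape((stop - start,) + tuple(shape[1:]))
    return out
```

Calling save_compressed(a, 'a.dat.gz') and later 
load_compressed('a.dat.gz', a.shape, a.dtype) should round-trip the 
array. How much the file shrinks depends entirely on the data.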

Thanks, Perry Greenfield




