[Numpy-discussion] Re: large file and array support

Perry Greenfield perry at stsci.edu
Tue Mar 29 18:26:35 CST 2005


On Mar 29, 2005, at 9:11 PM, Travis Oliphant wrote:

> There are two distinct issues with regards to large arrays.
>
> 1) How do you support > 2 GB memory-mapped arrays on 32-bit systems, 
> and other large-object arrays of which only a part is in memory at 
> any given time? (There is an equivalent problem for > 8 EB on 64-bit 
> systems; an exabyte is 2^60 bytes, a giga-giga-byte.)
>
> 2) Supporting the sequence protocol for in-memory objects on 64-bit 
> systems.
>
> Part 2 can be fixed using the recommendations Martin is making, 
> which will likely happen (though it could definitely be done faster).  
> Handling part 1 is more difficult.
>
> One idea is to define some kind of "super object" that mediates 
> between the large file and the in-memory portion.  In other words, the 
> ndarray is an in-memory object, while the super object handles 
> interfacing it with a larger structure.
>
> Thoughts?
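[A minimal sketch of what such a "super object" might look like: an object that presents itself as one huge array but keeps only a bounded in-memory window, reading pieces from the backing file on demand. All names here are illustrative, not part of any Numeric/numarray API.]

```python
import os
import tempfile

class WindowedArray:
    """Hypothetical mediator between a large file and an in-memory window.

    Indexing loads at most one window's worth of bytes at a time, so the
    resident set stays bounded no matter how large the file is.
    """

    def __init__(self, path, window_bytes=64 * 1024):
        self.path = path
        self.window_bytes = window_bytes
        self.nbytes = os.path.getsize(path)
        self._start = None          # byte offset of the cached window
        self._buf = b""

    def _load(self, offset):
        # Read the window containing `offset` into memory (if not cached).
        start = (offset // self.window_bytes) * self.window_bytes
        if start != self._start:
            with open(self.path, "rb") as f:
                f.seek(start)
                self._buf = f.read(self.window_bytes)
            self._start = start

    def __getitem__(self, i):
        if not 0 <= i < self.nbytes:
            raise IndexError(i)
        self._load(i)
        return self._buf[i - self._start]

# Demo on a small scratch file standing in for a multi-gigabyte one.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 1024)   # 256 KiB repeating byte pattern

a = WindowedArray(path)
first = a[5]       # falls in the first 64 KiB window
far = a[70000]     # forces a different window to be loaded
os.remove(path)
```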

Maybe I'm missing something, but isn't it possible to mmap part of a 
large file? In that case one just limits each memory map to what a 
32-bit system can handle, leaving it up to the user software to 
determine which part of the file to mmap. Did you have something more 
automatic in mind? As for other large-object arrays, I'm not sure what 
examples there are besides memory mapping. Do you have any?

Perry
