[AstroPy] PyFITS and mmap

Erik Bray embray@stsci....
Thu Sep 22 11:21:52 CDT 2011


Hi all,

Every now and then PyFITS gets support requests from people trying to 
work with very large FITS files (>4GB; I've seen as high as 50 GB) and 
having trouble when they run out of memory.

Normally I point them to the memmap=True option to pyfits.open(), and 
that works for them.  On 64-bit systems in particular there's more than 
enough virtual address space to mmap very large files.
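
For anyone unfamiliar with the mechanism behind memmap=True, here is a
minimal sketch using only the standard-library mmap module. It is not
PyFITS code: the helper name read_values and the temporary stand-in file
are made up for illustration, and the file holds raw big-endian float32
values (the byte order FITS uses) rather than a real FITS HDU. It shows
that a small slice can be pulled out of a mapped file without reading
the rest into memory.

```python
import mmap
import os
import struct
import tempfile


def read_values(path, offset, count):
    """Read `count` big-endian float32 values starting at byte `offset`.

    The file is memory-mapped, so only the pages backing the requested
    slice need to be brought in from disk.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            raw = mm[offset:offset + 4 * count]  # copies just this slice
    return struct.unpack(">%df" % count, raw)


# Build a small stand-in data file: 1000 float32 values, 0.0 .. 999.0.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(struct.pack(">1000f", *range(1000)))

# Fetch three values from the middle of the file (element index 500).
values = read_values(path, 4 * 500, 3)
os.remove(path)
print(values)
```

With PyFITS itself the equivalent is simply passing memmap=True to
pyfits.open(), as described above.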

And I got to thinking that while most FITS files I encounter are not 
many gigabytes in size, they are still over 100 MB.  And there are only 
so many operations that actually require having an entire array in 
memory at once.  So maybe it would make sense to have PyFITS use mmap by 
default.

There could be some slight performance implications here: for example, 
when reading the data a little bit at a time, mmap is a little bit 
slower, unsurprisingly.  But in practice I don't think it's a very 
noticeable difference, and the benefits--far less memory usage and more 
transparent support for large files--outweigh any drawbacks I can think of.
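
One rough way to check the chunk-at-a-time cost on a given machine is
sketched below. It uses only the standard library, not PyFITS; the
function names, the 4096-byte chunk size, and the 1 MiB test file are
arbitrary choices for illustration. Timings depend heavily on the OS,
the filesystem, and whether the file is already in the page cache, so
treat any numbers it prints as indicative at best.

```python
import mmap
import os
import tempfile
import timeit

CHUNK = 4096  # bytes per read; an arbitrary choice for this sketch


def sum_with_read(path):
    """Sum all byte values using ordinary chunked read() calls."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += sum(chunk)
    return total


def sum_with_mmap(path):
    """Sum all byte values by slicing a memory-mapped view chunk by chunk."""
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, len(mm), CHUNK):
                total += sum(mm[start:start + CHUNK])
    return total


# Create a 1 MiB test file with a known byte pattern.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(bytes(range(256)) * 4096)

# Both approaches must agree on the result before timing them.
read_total = sum_with_read(path)
mmap_total = sum_with_mmap(path)

read_time = timeit.timeit(lambda: sum_with_read(path), number=3)
mmap_time = timeit.timeit(lambda: sum_with_mmap(path), number=3)
os.remove(path)

print("read(): %.4f s, mmap: %.4f s" % (read_time, mmap_time))
```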

I'm just putting this out there because I wonder if there are any other 
downsides to this that I'm not thinking of.

Thanks,
Erik

