[AstroPy] PyFITS and mmap
Thu Sep 22 11:21:52 CDT 2011
Every now and then PyFITS gets support requests from people trying to
work with very large FITS files (>4 GB; I've seen as high as 50 GB) and
having trouble when they run out of memory.
Normally I point them to the memmap=True option to pyfits.open(), and
that works for them. On 64-bit systems in particular there's more than
enough virtual address space to mmap very large files.
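To illustrate why mmap helps here, below is a minimal sketch using only Python's standard mmap module, not PyFITS itself. The helper names read_slice and make_demo_file are hypothetical and exist only for this example. It maps a whole file but copies out only the requested bytes, which is roughly what the memmap=True option relies on when only part of an array is accessed.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset` via a read-only memory map."""
    with open(path, "rb") as f:
        # A length of 0 maps the entire file; the OS loads pages lazily,
        # so only the pages backing the requested slice need to be read.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


def make_demo_file(size=1 << 20):
    """Create a throwaway file of `size` bytes with a marker at the midpoint."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (size // 2))
        f.write(b"FITS")
        f.write(b"\0" * (size - size // 2 - 4))
    return path
```

For example, read_slice(path, 512 * 1024, 4) on the demo file returns the four marker bytes without the program ever holding the full megabyte in a Python object.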
And I got to thinking that while most FITS files I encounter are not
many gigabytes in size, they are still over 100 MB. And there are only
so many operations that actually require having an entire array in
memory at once. So maybe it would make sense to have PyFITS use mmap by
default.
There could be some slight performance implications here: for example,
when reading the data a little bit at a time, mmap is a little bit slower,
unsurprisingly. But in practice I don't think it's a very noticeable
difference, and the benefits--far less memory usage and more transparent
support for large files--outweigh any drawbacks I can think of.
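Anyone who wants to check the chunked-read overhead on their own system could start from something like the rough sketch below. It uses only the standard library rather than PyFITS, and the function names (count_with_read, count_with_mmap, time_call) are made up for this example. Both functions walk a file in fixed-size chunks and count occurrences of a single byte value, so the work per chunk is the same and any timing difference comes mostly from the access method.

```python
import mmap
import time


def count_with_read(path, chunk_size, needle=b"\x01"):
    """Count occurrences of a single byte using ordinary buffered reads."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(needle)
    return total


def count_with_mmap(path, chunk_size, needle=b"\x01"):
    """Count occurrences of a single byte by slicing a read-only memory map."""
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, len(mm), chunk_size):
                # Each slice copies one chunk out of the mapping.
                total += mm[start:start + chunk_size].count(needle)
    return total


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call to func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Calling time_call(count_with_read, path, 4096) and time_call(count_with_mmap, path, 4096) on the same file gives a first comparison. The numbers depend heavily on the OS page cache, the chunk size, and the storage device, so it is worth repeating the runs before drawing conclusions.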
I'm just putting this out there because I wonder if there are any other
downsides to this that I'm not thinking of.