[AstroPy] reading one line from many small fits files

Perry Greenfield perry@stsci....
Fri Aug 3 05:16:22 CDT 2012

On Aug 2, 2012, at 11:40 PM, John K. Parejko wrote:

> Follow up on this:
> Erin's suggestion to use fitsio gave me a factor of more than 10  
> improvement in speed. I was quite astonished at how much faster it  
> was, so I've written up a short example, and attached it. On my  
> laptop (13" macbook pro, OS X 10.6.8, regular HDD), the code  
> produces the following:
> $ python fits_tester.py
> fitsio version: 0.9.0
> pyfits version: 3.0.6
> Single pass: fitsio took  1.14109 seconds.
> Single pass: pyfits took 14.64361 seconds.
> One of the problems with the pyfits version is that I don't know how  
> to efficiently get at row(n) of a pyfits object in a form that can  
> be directly ingested into an ndarray. If there is a way to make the  
> pyfits version significantly faster just by calling pyfits  
> differently, I'm all ears.
> Looking at the profiles for the runs (output to .prof files), it  
> looks like pyfits is doing a lot of object creation and destruction  
> in the background, which may be what's killing it.
> Anyway, there does seem to be a major difference in speed here, even  
> in what is probably the most favorable configuration for pyfits,  
> with it running last and thus having files potentially cached.
> Assuming this difference isn't just me, is there a way to get these  
> speed improvements merged into pyfits?
I'm not so familiar with the internals now, but one approach we can  
take is to expose a "bare" ndarray for a table. That probably should  
short-circuit a lot of the object stuff that has to deal with special  
FITS cases like mapping FITS booleans into numpy booleans, applying  
bscale/bzero, etc. (of course, that means you have to handle those  
conversions yourself if any of these cases exist). I'll have to ask Erik about that. There still  
would be the time needed to parse the header to determine the  
structure of the array.
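
As a rough illustration of the "bare" ndarray idea (a sketch in plain numpy, not pyfits code): once the row layout is known, row n can be mapped directly out of the raw bytes with np.frombuffer. The column names, the dtype, the read_row helper, and the in-memory stand-in for a file are all made up for the example; in a real file the layout would come from parsing the table header, and the data would not necessarily start right after a single 2880-byte block.

```python
import numpy as np

# Hypothetical row layout. FITS binary tables store values big-endian,
# hence the ">" prefixes. A real layout would be derived from the header.
row_dtype = np.dtype([("id", ">i4"), ("flux", ">f8"), ("flag", "S1")])


def read_row(buf, data_offset, n, dtype=row_dtype):
    """Return row n of a fixed-width table as a numpy structured scalar."""
    offset = data_offset + n * dtype.itemsize
    return np.frombuffer(buf, dtype=dtype, count=1, offset=offset)[0]


# Stand-in for a file: one 2880-byte header block followed by three rows.
rows = np.array([(1, 1.5, b"T"), (2, 2.5, b"F"), (3, 3.5, b"T")],
                dtype=row_dtype)
buf = b" " * 2880 + rows.tobytes()

row = read_row(buf, 2880, 1)

# The special cases are now the caller's job, e.g. turning a FITS
# logical ("T"/"F") into a Python boolean by hand.
flag_as_bool = row["flag"] == b"T"
```

No per-row objects are created beyond the structured scalar itself, which is the point of the short-circuit; any bscale/bzero scaling would likewise have to be applied by hand to the relevant columns.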

