[Numpy-discussion] recommendation for saving data

Christopher Barker Chris.Barker@noaa....
Mon Aug 1 12:08:21 CDT 2011

On 7/31/11 5:48 AM, Brian Blais wrote:
> I was wondering if there are any recommendations for formats for saving scientific data.

Every field has its own standards -- I'd try to find one that is likely 
to be used by folks that may care about your results.

For oceanographic and atmospheric modeling data, NetCDF is a good 
option. I like the netCDF4 Python lib:


(there are others)

For broader use, and a bit more flexibility, HDF is a good option. There 
are at least two ways to use it with numpy:

PyTables: http://www.pytables.org

(Nice higher-level interface)


(a more raw HDF5 wrapper)

There is also the npz format, built into numpy, if you are happy with 
requiring Python to read the data.
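
As a rough sketch of the npz route (not a recommendation over the formats 
above): each part of a simulation could hand back a dict of arrays, and the 
nested dicts could be flattened into dotted names so that everything lands 
in one file. The state layout, the names part_a/part_b, and the helper 
flatten_state are all made up for illustration; only np.savez_compressed 
and np.load are numpy's own.

```python
import os
import tempfile

import numpy as np


def flatten_state(state, prefix=""):
    """Flatten a nested dict into {"parent.child.name": array} (hypothetical helper)."""
    flat = {}
    for key, value in state.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-parts, extending the dotted name.
            flat.update(flatten_state(value, path))
        else:
            # Scalars become 0-d arrays so everything is stored uniformly.
            flat[path] = np.asarray(value)
    return flat


# Hypothetical simulation state: two parts, each with its own data.
state = {
    "part_a": {"weights": np.arange(6.0).reshape(2, 3), "rate": 0.5},
    "part_b": {"history": np.zeros(4)},
}

path = os.path.join(tempfile.mkdtemp(), "simulation.npz")

# One compressed file holding every array, keyed by its dotted name.
np.savez_compressed(path, **flatten_state(state))

# Read everything back into a flat dict of arrays.
with np.load(path) as data:
    restored = {name: data[name] for name in data.files}
```

Note that this only covers plain arrays and scalars; arbitrary Python 
objects would still need some other treatment.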


> I am running a simulation, which has many somewhat-independent parts 
> which have their own internal state and parameters.  I've been using 
> pickle (gzipped) to save the entire object (which contains subobjects, 
> etc...), but it is getting too unwieldy and I think it is time to look 
> for a more robust solution.  Ideally I'd like to have something where I 
> can call a save method on the simulation object, and it will call the 
> save methods on all the children, on down the line all saving into one 
> file.  It'd also be nice if it were cross-platform, and I could depend 
> on the files being readable into the future for a while.
> Are there any good standards for this?  What do you use for saving scientific data?
> 		thank you,
> 			Brian Blais

Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

