[SciPy-user] Mapping a series of files.

Dharhas Pothina Dharhas.Pothina@twdb.state.tx...
Thu Aug 7 11:29:31 CDT 2008

There are some issues with converting to netcdf, mainly that there is no standard for unstructured grids in netcdf. Most of the tools work for structured grids. There have been a couple of attempts to come up with an unstructured grid netcdf standard, but from what I can tell they petered out in 2006. We are struggling with this right now, since we have a couple of different hydro models and are trying to define a common format so we can develop our analysis and visualization tools.

My present idea is to write a module that abstracts the details of each model format and allows me to load the data into python.
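
A rough sketch of what such a module could look like, assuming a small registry of per-format reader functions that all return the same in-memory structure. Every name here (MeshData, register, load, the "demo" format) is hypothetical and only for illustration; the triangular-mesh layout is likewise an assumption, not something any of the models dictates.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class MeshData:
    """Common in-memory form for output from any model (hypothetical layout)."""
    nodes: np.ndarray     # (n_nodes, 2) x, y coordinates
    elements: np.ndarray  # (n_elements, 3) node indices of each triangle
    values: np.ndarray    # (n_times, n_nodes) one variable over time


_READERS = {}  # format name -> reader function


def register(fmt):
    """Decorator that files a reader function under a format name."""
    def wrap(func):
        _READERS[fmt] = func
        return func
    return wrap


def load(path, fmt):
    """Dispatch to the reader for `fmt`, so callers never see the file layout."""
    try:
        reader = _READERS[fmt]
    except KeyError:
        raise ValueError("no reader registered for format %r" % fmt)
    return reader(path)


@register("demo")
def _read_demo(path):
    # Stand-in reader: a real one would parse the model's own binary layout.
    nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    elements = np.array([[0, 1, 2]])
    values = np.zeros((1, 3))
    return MeshData(nodes, elements, values)
```

Analysis and vis code would then only ever call load(path, fmt) and work with MeshData, and supporting another model means adding one more registered reader.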

Will your module work with unstructured grids?

- dharhas

>>> Charles Doutriaux <doutriaux1@llnl.gov> 8/7/2008 11:15 AM >>>

If your files can be converted to netcdf (or grib), then we have a
tool to do exactly what you want.
Basically you'd run
cdscan -x full.xml *.nc

It would generate an xml file that simulates being one full file.

Then, using our cdms2 read module, with f being the opened full.xml, you would do

data = f("var", time=('2008-1', '2008-7'))
It would figure out for you which files to open. You could even be more 
restrictive by selecting a sub region (latitude=(-20,20)) etc...

for more info:


Dharhas Pothina wrote:
> Hi,
> I've been following the thread on 'partially reading a file' with some interest and have a related question.
> So I have a series of large binary data files (1_data.dat, 2_data.dat, etc.) that represent a 3D time series of data. Right now I am cycling through all the files, reading the entire dataset into memory and extracting the subset I need. This works but is extremely memory hungry and slow, and I'm running out of memory for datasets more than a year long. I could calculate which few files contain the data I need and only read those in, but that is a bit cumbersome and also doesn't help if I need a 1D or 2D slice of the whole time period.
> In the other thread Travis gave an example of using memmap to map a file to memory. Can I do this with multiple files, i.e. use memmap to generate an array[x,y,z,t] that I can then slice to read only what I need? Another complication is that each binary file has a header section and then a data section. By reading the first file I can calculate the offset for the data part of the file.
> thanks,
> - dharhas
> _______________________________________________
> SciPy-user mailing list
> SciPy-user@scipy.org 
> http://projects.scipy.org/mailman/listinfo/scipy-user
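
The multi-file memmap question in the quoted message can be sketched roughly as follows. numpy.memmap maps one file at a time, so this keeps a list of per-file maps, skips each header via the offset argument, and joins only the requested slices. The helper names (open_series, read_subset), the float32 dtype, the time-first (t, z, y, x) layout within each file, and the equal header size across files are all assumptions for illustration, not details of the actual data files.

```python
import numpy as np


def open_series(paths, header_bytes, shape, dtype=np.float32):
    """Map the data section of each file without reading it into memory.

    paths        -- files in time order (e.g. 1_data.dat, 2_data.dat, ...)
    header_bytes -- size of the header to skip (assumed equal for all files)
    shape        -- (nt, nz, ny, nx) of the data section of one file (assumed)
    """
    return [
        np.memmap(p, dtype=dtype, mode="r", offset=header_bytes, shape=shape)
        for p in paths
    ]


def read_subset(maps, index):
    """Apply the same (z, y, x) index to every file and join along time.

    Only the parts of each file touched by the index are read from disk;
    the joined result is an ordinary in-memory array.
    """
    pieces = [m[(slice(None),) + tuple(index)] for m in maps]
    return np.concatenate(pieces, axis=0)
```

For example, read_subset(maps, (0, slice(None), 5)) would return a (time, y) section at z index 0 and x index 5 for the whole period, without loading the full dataset.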
