[Numpy-discussion] Reading a big netcdf file

Christopher Barker Chris.Barker@noaa....
Thu Aug 4 12:02:19 CDT 2011


On 8/4/11 3:46 AM, Kiko wrote:
> In [9]: z4 = gebco4.variables['z']
>
> I got no problems and I have:
>
> In [14]: type(z4); z4.shape; z4.size
> Out[14]: <type 'netCDF4.Variable'>
> Out[14]: (233312401,)
> Out[14]: 233312401
>
> But if I do:
>
> In [15]: z4 = gebco4.variables['z'][:]

> MemoryError

> What's the difference between z4 as a netCDF4.Variable and as a
> numpy.ndarray?

A netCDF4.Variable is an object that holds the properties of the 
variable, but does not actually load the data from the file into memory 
until it is needed, so it doesn't matter how big the data is at this 
point. Indexing it with [:] asks for the whole array at once, and that 
is where the MemoryError comes from.
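
Because the Variable supports slicing, one way to avoid loading 
everything is to walk through it in pieces. The following is only a 
minimal sketch: chunked_min_max and its chunk_size parameter are 
hypothetical names, and a plain Python list stands in for the variable 
so the example runs without a netCDF file. With the session above, the 
call would look like chunked_min_max(z4, z4.shape[0]).

```python
def chunked_min_max(var, n, chunk_size=1000000):
    """Return (min, max) of a 1-D sliceable `var` of length `n`,
    reading at most `chunk_size` elements at a time."""
    lo = hi = None
    for start in range(0, n, chunk_size):
        # Only this slice is held in memory, not the whole variable.
        block = var[start:start + chunk_size]
        b_lo, b_hi = min(block), max(block)
        lo = b_lo if lo is None else min(lo, b_lo)
        hi = b_hi if hi is None else max(hi, b_hi)
    return lo, hi


# Stand-in data (an assumption for illustration, not the GEBCO grid).
demo = list(range(-50, 50))
result = chunked_min_max(demo, len(demo), chunk_size=7)
print(result)
```

The same pattern applies to any reduction that can be accumulated 
block by block; the chunk size is a trade-off between memory use and 
the number of reads.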

> The results of ncdump -h
...
>          short z_range(side) ;
>                  z_range:units = "user_z_unit" ;

On 8/4/11 8:53 AM, Jeff Whitaker wrote:
> Kiko: I think the difference may be that when you read the data with
> netcdf4-python, it tries to unpack the short integers to a float32
> array.

Jeff, why is that? Is it a netcdf4 convention? I always thought that 
the netcdf data model matched numpy's quite well, including the clear 
choice and specification of data type. I guess I've mostly used float 
data anyway, so hadn't noticed this, but it comes as a surprise to me!

> gebco4.set_automaskandscale(False)

> before reading the data from the gebco4 variable, it will work, since
> this turns off the auto conversion to float32.

Thanks -- I'll have to remember that.

> You'll have to do the conversion manually then, at which point you
> may run out of memory anyway.

Why would you have to do the conversion at all? (OK, you may, depending 
on your use case, but for the most part, data stored in a file as an 
integer type would be suitable for use in an integer array.)
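
To put rough numbers on it, here is a back-of-the-envelope sketch of 
the array sizes involved. It assumes z is stored as 16-bit shorts, as 
Jeff's comment suggests, and it counts only the final array, not any 
temporaries a conversion might need.

```python
N_VALUES = 233312401  # z4.size from the session quoted above

BYTES_INT16 = 2    # a netCDF "short"
BYTES_FLOAT32 = 4  # the type Jeff says the data is unpacked to


def mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / 2.0 ** 20


as_int16 = N_VALUES * BYTES_INT16
as_float32 = N_VALUES * BYTES_FLOAT32

print("int16:   %.0f MiB" % mib(as_int16))    # about 445 MiB
print("float32: %.0f MiB" % mib(as_float32))  # about 890 MiB
```

Keeping the data as integers halves the footprint, which can be the 
difference between fitting and not fitting in a 32-bit process.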

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

