[Numpy-discussion] NetCDF4/numpy question

Eric Firing efiring@hawaii....
Fri Jan 27 16:21:04 CST 2012

On 01/27/2012 11:18 AM, Howard wrote:
> Hi all
> I am a fairly recent convert to python and I have got a question that's
> got me stumped. I hope this is the right mailing list: here goes :)
> I am reading some time series data out of a netcdf file a single
> timestep at a time. If the data is NaN, I want to reset it to the
> minimum of the dataset over all timesteps (which I already know). The
> data is in a variable of type numpy.ma.core.MaskedArray called modelData.
> If I do this:
> for i in range(len(modelData)):
>     if math.isnan(modelData[i]):
>         modelData[i] = dataMin
> I get the effect I want. If I do this:
> modelData[np.isnan(modelData)] = dataMin
> it doesn't seem to be working. Of course I could just do the first one,
> but len(modelData) is about 3.5 million, and it's taking about 20
> seconds to run. This is happening inside of a rendering loop, so I'd
> like it to be as fast as possible, and I thought the second one might be
> faster, and maybe it is, but it doesn't seem to be working! :)

It would help if you would say explicitly what you mean by "doesn't seem 
to be working", ideally by providing a minimal complete example 
illustrating the problem.

Does modelData have masked values that you want to keep separate from 
your NaN values?  If not, you can do this:

y = np.ma.masked_invalid(modelData).filled(dataMin)

Then y will be an ordinary ndarray.  If this is not satisfactory because 
you need to keep separate some initially masked values, then you may 
need to save the initial mask and use it to turn y back into a masked array.
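A minimal sketch of that save-and-restore step follows. The array contents and the value of dataMin are made-up stand-ins for illustration; only masked_invalid and filled come from the suggestion above.

```python
import numpy as np

# Hypothetical stand-ins for the poster's modelData and dataMin.
modelData = np.ma.array([1.0, np.nan, 2.0, 5.0],
                        mask=[False, False, True, False])
dataMin = -1.0

# Save the mask as it was before any NaN handling.
initial_mask = np.ma.getmaskarray(modelData).copy()

# Mask the invalid (NaN/inf) entries too, then fill every masked slot
# with dataMin. The result is an ordinary ndarray.
y = np.ma.masked_invalid(modelData).filled(dataMin)

# Turn y back into a masked array carrying only the original mask.
restored = np.ma.array(y, mask=initial_mask)
```

Note that in this sketch the data underneath the originally masked entries is also overwritten with dataMin; restoring the mask hides those entries again but does not bring back their old values.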

You may be running into trouble with the boolean-indexing approach because
np.isnan applied to a masked array returns a masked array, and I think
indexing with a masked array is not advised.

In [2]: np.isnan(np.ma.array([1.0, np.nan, 2.0], mask=[False, False, True]))
masked_array(data = [False True --],
              mask = [False False  True],
        fill_value = True)
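One way to sidestep the question, sketched here under the assumption that the existing mask should be left untouched, is to compute the NaN test on the underlying data so that the index is a plain boolean ndarray. The array and dataMin below are illustrative values, not the poster's data.

```python
import numpy as np

# Same small example array as above; dataMin is a hypothetical value.
modelData = np.ma.array([1.0, np.nan, 2.0], mask=[False, False, True])
dataMin = -1.0

# getdata returns the underlying ndarray, so isnan yields a plain
# boolean ndarray rather than a masked array.
nan_index = np.isnan(np.ma.getdata(modelData))

# Assign through the plain boolean index; the masked entry stays masked.
modelData[nan_index] = dataMin
```

With this approach modelData remains a masked array, and only the unmasked NaN entry is replaced.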


> Any ideas would be much appreciated.
> Thanks
> Howard
> --
> Howard Lander <mailto:howard@renci.org>
> Senior Research Software Developer
> Renaissance Computing Institute (RENCI) <http://www.renci.org>
> The University of North Carolina at Chapel Hill
> Duke University
> North Carolina State University
> 100 Europa Drive
> Suite 540
> Chapel Hill, NC 27517
> 919-445-9651