[Numpy-discussion] NetCDF4/numpy question

Howard howard@renci....
Fri Jan 27 16:37:35 CST 2012


On 1/27/12 5:21 PM, Eric Firing wrote:
> On 01/27/2012 11:18 AM, Howard wrote:
>> Hi all
>>
>> I am a fairly recent convert to python and I have got a question that's
>> got me stumped. I hope this is the right mailing list: here goes :)
>>
>> I am reading some time series data out of a NetCDF file, one
>> timestep at a time. If the data is NaN, I want to reset it to the
>> minimum of the dataset over all timesteps (which I already know). The
>> data is in a variable of type numpy.ma.core.MaskedArray called modelData.
>>
>> If I do this:
>>
>> for i in range(len(modelData)):
>>     if math.isnan(modelData[i]):
>>         modelData[i] = dataMin
>>
>> I get the effect I want. If I do this:
>>
>> modelData[np.isnan(modelData)] = dataMin
>>
>> it doesn't seem to be working. Of course I could just do the first one,
>> but len(modelData) is about 3.5 million, and it's taking about 20
>> seconds to run. This is happening inside of a rendering loop, so I'd
>> like it to be as fast as possible, and I thought the second one might be
>> faster, and maybe it is, but it doesn't seem to be working! :)
> It would help if you would say explicitly what you mean by "doesn't seem
> to be working", ideally by providing a minimal complete example
> illustrating the problem.
Hi Eric

Thanks for the reply.  Yes, I can be a little more specific about the 
issue.  I am reading data from a storm surge model out of a NetCDF file 
so I can render it with tricontourf. The model data has both a 
triangulation and a set of lat, lon points that are invariant for the 
entire model run, as well as data for each time step. As the model runs, 
triangles in the coastal plain wet and dry: the dry values are indicated 
by NaN values in the data and should not be rendered.  Those I mask off 
before this code runs. I have found, in using tricontourf, that in the 
mapping from data values to color values, the range of the data seems to 
include even the data from the masked triangles.  This causes the data 
to be either monochromatic or bi-chromatic (the high and low colors in 
the map).  However, once the triangles are masked, if I set the 
corresponding data values to the known dataMin (or in fact, any value in 
the valid data range) the render proceeds correctly.  So in the case of 
the first piece of code, I get reasonable images: using the second I do not.
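
A minimal sketch of the workaround described above, using made-up values: `raw` and `data_min` are hypothetical stand-ins for one timestep of model output and the known minimum. It replaces the NaN values in the underlying data while keeping the dry nodes masked. Whether tricontourf still needs this may depend on the matplotlib version, so treat it as an illustration, not a fix.

```python
import numpy as np

# Hypothetical stand-ins for one timestep of model output.
raw = np.array([0.5, np.nan, 1.5, np.nan, 2.0])
data_min = 0.5  # assumed known minimum over all timesteps

# Plain boolean ndarray marking the dry (NaN) nodes.
dry = np.isnan(raw)

# Replace NaN in the underlying data so no NaN can leak into a range computation.
values = np.where(dry, data_min, raw)

# Keep the dry nodes masked so they can still be excluded from rendering.
model_data = np.ma.array(values, mask=dry)
```

Because `dry` is computed from the raw array before any masking, it is an ordinary boolean array and is safe to reuse as a mask or an index.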

>
> Does modelData have masked values that you want to keep separate from
> your NaN values?  If not, you can do this:

No, I don't think so.
>
> y = np.ma.masked_invalid(modelData).filled(dataMin)
>
> Then y will be an ordinary ndarray.  If this is not satisfactory because
> you need to keep separate some initially masked values, then you may
> need to save the initial mask and use it to turn y back into a masked array.
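
A small sketch of the suggestion above, including the optional step of restoring the initial mask. The array contents and `data_min` are invented for illustration and are not taken from the model output.

```python
import numpy as np

data_min = -1.0  # hypothetical known minimum
# Hypothetical data: one unmasked NaN and one initially masked value.
model_data = np.ma.array([1.0, np.nan, 2.0, 3.0],
                         mask=[False, False, True, False])

# Save the initial mask before it is merged with the NaN positions.
initial_mask = np.ma.getmaskarray(model_data).copy()

# Mask NaN/inf as well, then fill every masked slot with data_min.
# The result is an ordinary ndarray.
y = np.ma.masked_invalid(model_data).filled(data_min)

# Optional: turn y back into a masked array carrying only the initial mask.
restored = np.ma.array(y, mask=initial_mask)
```

Note that `filled` also overwrites the data under the initially masked slot, so only the mask is recovered by the last step, not the original value underneath it.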
>
> You may be running into trouble with your initial approach because using
> np.isnan on a masked array is giving a masked array, and I think trying
> to index with a masked array is not advised.
This could certainly be the issue. I will look into this on Monday.

Thanks very much for taking the time to reply.
Howard

>
> In [2]: np.isnan(np.ma.array([1.0, np.nan, 2.0], mask=[False, False, True]))
> Out[2]:
> masked_array(data = [False True --],
>                mask = [False False  True],
>          fill_value = True)
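
One way to sidestep indexing with a masked array, sketched here with invented values, is to build the boolean index from the underlying data so that it is a plain ndarray. `data_min` is a hypothetical placeholder for the known minimum.

```python
import numpy as np

data_min = 0.0  # hypothetical known minimum
model_data = np.ma.array([1.0, np.nan, 2.0], mask=[False, False, True])

# np.ma.getdata returns the underlying ndarray, so np.isnan yields a
# plain boolean ndarray rather than a masked array.
nan_here = np.isnan(np.ma.getdata(model_data))

# Boolean-index assignment replaces the NaN entries in one vectorized step.
model_data[nan_here] = data_min
```

This keeps the one-line, vectorized form of the original second snippet while leaving the existing mask on the other elements untouched.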
>
> Eric
>
>> Any ideas would be much appreciated.
>>
>> Thanks
>> Howard
>>
>> --
>> Howard Lander<mailto:howard@renci.org>
>> Senior Research Software Developer
>> Renaissance Computing Institute (RENCI)<http://www.renci.org>
>> The University of North Carolina at Chapel Hill
>> Duke University
>> North Carolina State University
>> 100 Europa Drive
>> Suite 540
>> Chapel Hill, NC 27517
>> 919-445-9651
>>
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion

