[Numpy-discussion] Warnings in numpy.ma.test()
Thu Mar 18 19:32:38 CDT 2010
On Thu, Mar 18, 2010 at 7:26 PM, Christopher Barker
> firstname.lastname@example.org wrote:
>>> I'm facing this at the moment: not a big deal, but I'm using histogram2d
>>> on some data. I just realized that it may have some NaNs in it, and I
>>> have no idea how those are being handled.
>> histogram2d handles neither masked arrays nor arrays with nans
> I really wasn't asking for help (yet) .. but thanks!
>> (array([[ 0.,  0.,  1.],
>>        [ 0.,  0.,  0.],
>>        [ 1.,  0.,  0.]]),
>>  array([ 1.        ,  1.66666667,  2.33333333,  3.        ]),
>>  array([ 1.        ,  1.33333333,  1.66666667,  2.        ]))
> I'll probably do something like that for now. I guess the question is --
> should this be built in to histogram2d (and other similar functions)?
I think yes, for all functions that are closer to actual data and
where there is an obvious way to handle the missing values. But, it's
work and adds a lot of code to a nice simple function. And if it's
just one extra line for the user, then it is not too high on my
priority list.
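
To make the "one extra line" concrete: it amounts to dropping the invalid
points before calling histogram2d. The following is only a minimal sketch,
assuming 1-D inputs and assuming that masked entries and NaNs should both be
discarded. The helper name histogram2d_valid and the sample data are made up
for illustration; neither is part of numpy.

```python
import numpy as np

def histogram2d_valid(x, y, bins=10):
    """Hypothetical helper: histogram2d over points where x and y are both valid."""
    # Turn masked entries into NaN so a single test covers both kinds of
    # missing value. Plain ndarrays pass through unchanged.
    x = np.ma.filled(np.ma.asarray(x, dtype=float), np.nan)
    y = np.ma.filled(np.ma.asarray(y, dtype=float), np.nan)
    # Keep a point only if both coordinates are finite
    # (assumption: infinities are dropped as well).
    keep = np.isfinite(x) & np.isfinite(y)
    return np.histogram2d(x[keep], y[keep], bins=bins)

# Invented sample data: one NaN in x, one masked entry in y.
x = np.array([1.0, 2.0, np.nan, 3.0])
y = np.ma.masked_array([2.0, 1.5, 1.0, 1.0], mask=[False, True, False, False])

H, xedges, yedges = histogram2d_valid(x, y, bins=3)
print(H)
```

Only the two points that are valid in both x and y are counted. Whether
dropping is the "obvious" treatment depends on the function, which is part of
why building it in is not automatic.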
For example, I rewrote stats.zscore a while ago to handle also
matrices and masked arrays, and Bruce rewrote geometric mean and
others, but these are easy cases, for many of the other functions it's
Also, if a function gets too much overhead, I end up rewriting and
inlining the core of the function over and over again when I need it
inside a loop, for example for optimization, or I keep a copy of the
function around that doesn't use the overhead.
I actually do little profiling, so I don't really know what the cost
would be in a loop.
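
One rough way to put a number on that cost is to time the call with and
without the extra validity check. This is a sketch under assumptions: the
array size, bin count, and repeat count are arbitrary, the function names
are hypothetical, and the result will vary by machine and numpy version.

```python
import timeit
import numpy as np

# Arbitrary test data with no missing values, so both variants
# compute the same histogram.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

def plain():
    # Direct call, no missing-value handling.
    return np.histogram2d(x, y, bins=10)

def with_filter():
    # Same call preceded by the validity filter.
    keep = np.isfinite(x) & np.isfinite(y)
    return np.histogram2d(x[keep], y[keep], bins=10)

n = 200  # number of repetitions, chosen arbitrarily
t_plain = timeit.timeit(plain, number=n) / n
t_filter = timeit.timeit(with_filter, number=n) / n
print(f"plain: {t_plain:.2e} s/call, with filter: {t_filter:.2e} s/call")
```

The difference between the two timings is the per-call price of the check,
which is what matters when the function sits inside an optimization loop.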
> Christopher Barker, Ph.D.
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception