[Numpy-discussion] Warnings in numpy.ma.test()
Christopher Barker
Chris.Barker@noaa....
Thu Mar 18 14:46:21 CDT 2010
Gael Varoquaux wrote:
> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote:
>> sure -- that's kind of my point -- if EVERY numpy array were
>> (potentially) masked, then folks would write code to deal with them
>> appropriately.
>
> That's pretty much saying: "I have a complicated problem and I want every
> one else to have to deal with the full complexity of it, even if they
> have a simple problem".
Well -- I did say it was a fantasy...
But I disagree -- having invalid data is a very common case. What we
have now is a situation where we have two parallel systems, masked
arrays and regular arrays. Each time someone does something new with
masked arrays, they often find another missing feature, and have to
solve that. Also, the fact that masked arrays are tacked on means that
performance suffers.
Maybe it would simply be too ugly, but if I were to start from the
ground up with a scientific computing package, I would want to put in
support for missing values from the start.
There are some cases where it is simply too complicated or too expensive
to handle missing values -- fine, then an exception is raised.
You may be right about how complicated it would be, and what would
happen is that everyone would simply put a:
    if a.masked:
        raise NotImplementedError("I can't deal with masked data")
stanza at the top of every new method they wrote, but I suspect that if
the core infrastructure was in place, it would be used.
I'm facing this at the moment: not a big deal, but I'm using histogram2d
on some data. I just realized that it may have some NaNs in it, and I
have no idea how those are being handled. I also want to move to masked
arrays and have no idea if histogram2d can deal with those. At the
least, I need to do some testing, and I suspect I'll need to do some
hacking on histogram2d (or just write my own).
I'm sure I'm not the only one in the world that needs to histogram some
data that may have invalid values -- so wouldn't it be nice if that were
already handled in a defined way?
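For illustration only, here is a minimal sketch of one "defined way" to handle this today: drop every (x, y) pair in which either coordinate is masked or non-finite, then pass the rest to numpy.histogram2d. The helper name histogram2d_valid is hypothetical and not part of NumPy, and silently dropping invalid points is an assumption about what the caller wants, not a description of how histogram2d itself treats NaNs or masked arrays.

```python
import numpy as np


def histogram2d_valid(x, y, bins=10, range=None):
    """Hypothetical helper (not a NumPy function).

    Build a 2-D histogram from the (x, y) pairs in which both
    coordinates are unmasked and finite. All other pairs are dropped.
    """
    # Accept plain sequences, ndarrays, or masked arrays alike.
    xm = np.ma.asarray(x, dtype=float)
    ym = np.ma.asarray(y, dtype=float)

    # A pair is valid only if neither value is masked, NaN, or inf.
    valid = (
        ~np.ma.getmaskarray(xm)
        & ~np.ma.getmaskarray(ym)
        & np.isfinite(xm.data)
        & np.isfinite(ym.data)
    )

    # Boolean indexing yields the 1-D arrays histogram2d expects.
    return np.histogram2d(xm.data[valid], ym.data[valid],
                          bins=bins, range=range)


# Example: only the first pair is both unmasked and finite.
x = np.ma.array([0.5, 1.5, np.nan, 2.5], mask=[False, True, False, False])
y = np.array([0.5, 0.5, 0.5, np.inf])
H, xedges, yedges = histogram2d_valid(x, y, bins=2, range=[[0, 3], [0, 1]])
```

Whether dropped points should instead be counted, reported, or raise an error is exactly the kind of policy question that would need a defined answer.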
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov