[Numpy-discussion] Re: [SciPy-user] Messing with missing values
pgmdevlist at mailcan.com
Sun Feb 26 21:20:03 CST 2006
On Sunday 26 February 2006 14:19, Sasha wrote:
> I am replying on "numpy-discussion" because this is really a numpy
> rather than scipy topic.
My bad, sorry for that.
> > Unfortunately, most of the numpy/scipy functions don't handle missing
> > values nicely.
>
> Can you specify which *numpy* functions are giving you trouble?
> That should be fixed.
Typical examples: median, stdev, diff... `stdev` is obvious, and `median` is
straightforward for 1d arrays (I'm still looking for an optimal method
for higher dimensions). The couple of `shape_base` functions I tried
(`hstack`, `column_stack`...) required filling the array beforehand and then
superimposing the corresponding mask.
Or even some methods such as `ndim` (more for convenience than anything, a
`len(x.shape)` does the trick for both masked & unmasked versions), or r_[].
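To make the fill-then-remask workaround concrete, here is a minimal sketch
for `hstack`. The helper name `masked_hstack` and its `fill_value` argument
are made up for illustration and are not part of numpy; it assumes a numpy
where `numpy.ma` provides `filled` and `getmaskarray`.

```python
import numpy as np
import numpy.ma as ma

def masked_hstack(arrays, fill_value=0):
    """Stack masked and/or plain arrays horizontally, keeping the masks.

    Hypothetical helper: the data are filled first, the masks are stacked
    separately, and the stacked mask is put back on top of the result.
    """
    # Replace masked entries with a placeholder so plain hstack can be used.
    data = np.hstack([ma.filled(a, fill_value) for a in arrays])
    # getmaskarray returns a full boolean mask, even for unmasked input.
    mask = np.hstack([ma.getmaskarray(a) for a in arrays])
    return ma.array(data, mask=mask)

a = ma.array([1, 2, 3], mask=[0, 1, 0])
b = np.array([4, 5])
stacked = masked_hstack([a, b])
```

The placeholder value never shows through, since every position it fills is
masked again in the result.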
I remember a message a couple of weeks ago wondering whether ma should be kept
up to date with the rest of numpy (and of course, I can't find the reference
right now). What's the status of ma?
> > How could I mask the values corresponding to
> > MA.masked in the final list, without having to check every single
> > element?
>
> Latest ma allows you to pass masked arrays directly to ufuncs. In
> order for this to work a ufunc should be registered in the "domains"
> and "fills" dictionaries. Not much documentation on this feature
> exists yet, so you will have to read the code in ma.py to figure this
> out.
Let's take the `median` example for 2D arrays. I end up with something like:
---
med = []
for x_i in x:
    med.append(median1d(x_i.compressed()))
---
with `median1d` a slightly modified version of the basic numpy `median`,
outputting `MA.masked` if `x_i.compressed()` is `None`. I need the `med` list
to be a masked_array. Paul Dubois suggests:
---
return ma.array(med, mask=[x is ma.masked for x in med])
---
I guess that's more efficient than the
---
return MA.masked_values(med.filled(nodata),nodata)
---
I had come up with. As a matter of fact, it seems even faster to hardcode the
`median1d` part in the loop.
But yes, I'm going to check the sources for the ufunc.
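For reference, the row-by-row approach can be put together as a
self-contained sketch. It assumes a numpy where the module is available as
`numpy.ma`; `median1d` here is a stand-in for my modified version (it treats
an empty compressed row as fully masked), and `masked_row_medians` is just
an illustrative name.

```python
import numpy as np
import numpy.ma as ma

def median1d(values):
    """Median of a plain 1-D array, or ma.masked when it is empty."""
    if values.size == 0:
        return ma.masked
    return float(np.median(values))

def masked_row_medians(x):
    """Row-wise median of a 2-D masked array, ignoring masked entries."""
    # compressed() drops the masked entries of each row.
    med = [median1d(x_i.compressed()) for x_i in x]
    # Build the mask with an identity test, as Paul Dubois suggests.
    mask = [m is ma.masked for m in med]
    # Use a placeholder for masked slots so the data list is purely numeric.
    data = [0.0 if m is ma.masked else m for m in med]
    return ma.array(data, mask=mask)

x = ma.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
             mask=[[0, 0, 1], [1, 1, 1], [0, 0, 0]])
row_medians = masked_row_medians(x)
```

The second row is entirely masked, so its median comes out masked as well;
the other two rows are computed from their unmasked values only.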
Thanks again.
--
Pierre GM