[Numpy-discussion] NA masks in the next numpy release?
Charles R Harris
Fri Oct 28 15:27:42 CDT 2011
2011/10/28 Stéfan van der Walt <firstname.lastname@example.org>
> On Fri, Oct 28, 2011 at 12:47 PM, Benjamin Root <email@example.com> wrote:
> > 2011/10/28 Stéfan van der Walt <firstname.lastname@example.org>
> >> The
> >> implementation as it stands essentially gives us a faster and more
> >> integrated version of numpy.ma; but it has become clear from this
> >> conversation that such an approach overlooks a very common subset of
> >> mask-related problems.
> > Which are...? (given the history of this discussion, let's not assume
> > anything is clear).
> The case where the number of elements in the array vastly outnumbers
> the number of masked elements. (Images, 3D volumes, large
> time-series, tables, etc.)
> E.g., if you are taking measurements from a sensor, but once in a blue
> moon the sensor messes up, you simply want to mark those values as
> missing, but you do not want to allocate a whole new chunk of memory
> to do so.
> I had a chat with JB Poline this morning, who mentioned that sparse
> matrix storage of the mask may also be an option. Those containers
> typically trade off insertion vs. lookup speeds, so I'm not sure
> whether it'd be feasible, but I like the idea.
I think simple run length encoding might work well with masks.
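
To make that concrete, here is a minimal sketch of run length encoding
for a 1-D boolean mask. The helpers rle_encode, rle_decode and
rle_lookup are hypothetical names used for illustration; they are not
part of NumPy or of the NA-mask implementation under discussion. The
assumption is that masked elements are rare, so the mask collapses to
a handful of runs.

```python
import numpy as np

def rle_encode(mask):
    """Encode a 1-D boolean mask as (first_value, run_lengths)."""
    mask = np.asarray(mask, dtype=bool).ravel()
    if mask.size == 0:
        return False, np.array([], dtype=np.intp)
    # Indices at which the value differs from the previous element.
    change = np.flatnonzero(mask[1:] != mask[:-1]) + 1
    boundaries = np.concatenate(([0], change, [mask.size]))
    return bool(mask[0]), np.diff(boundaries)

def rle_decode(first, runs):
    """Rebuild the full boolean mask from (first_value, run_lengths)."""
    runs = np.asarray(runs, dtype=np.intp)
    values = np.zeros(runs.size, dtype=bool)
    values[0::2] = first        # even-numbered runs carry the first value
    values[1::2] = not first    # odd-numbered runs carry the opposite value
    return np.repeat(values, runs)

def rle_lookup(first, runs, index):
    """Return the mask value at `index` without decoding the whole mask."""
    ends = np.cumsum(runs)                       # exclusive end of each run
    run = np.searchsorted(ends, index, side="right")
    return bool(first) != bool(run % 2)          # value flips on every run

# One million samples, three of them flagged as bad.
mask = np.zeros(1_000_000, dtype=bool)
mask[[10, 11, 500_000]] = True
first, runs = rle_encode(mask)   # five runs instead of a million bools
```

Lookup here costs a binary search over the run boundaries, and marking
a new element can split a run in two, so this has the same kind of
insertion vs. lookup trade-off as the sparse containers mentioned
above.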