[Numpy-discussion] feedback request: proposal to add masks to the core ndarray
Fri Jun 24 20:10:34 CDT 2011
On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett <email@example.com> wrote:
> On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney <firstname.lastname@example.org>
> > Perhaps we should make a wiki page someplace summarizing pros and cons
> > of the various implementation approaches?
> But - we should do this if it really is an open question which one we
> go for. If not then, we're just slowing Mark down in getting to the
> Assuming the question is still open, here's a starter for the pros and
> 1) It's easier / neater to implement
> 2) It can generalize across dtypes
> 3) You can still get the masked data underneath the mask (allowing you
> to unmask etc)
By setting up views appropriately, yes. If you don't have another view to
the underlying data, you can't get at it.
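The point about views can be sketched with the existing numpy.ma module. This is only an analogy: numpy.ma is not the core mask being proposed in this thread, and the variable names are made up for illustration. The masked value stays reachable only because a second, unmasked reference to the same buffer is kept around.

```python
import numpy as np

# Plain array that owns the buffer; this is the "other view" of the data.
data = np.array([1.0, 2.0, 3.0, 4.0])

# numpy.ma stands in for a masked view here (an assumption for illustration).
# With the default copy=False, the masked array shares memory with `data`.
masked = np.ma.masked_array(data, mask=[False, True, False, False])

# Reductions over the masked view skip the masked element.
total = masked.sum()

# The unmasked array still exposes the hidden value.
hidden = data[1]

# Confirm that both objects refer to the same underlying memory.
shares = np.shares_memory(masked.data, data)
```

If `data` were dropped and only a masked view remained, there would be no unmasked route to the element at index 1, which is the caveat raised above.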
> 1) No memory overhead
> 2) Battle-tested implementation already done in R
We can't really use that though, R is GPL and NumPy is BSD. The low-level
implementation details are likely different enough that a re-implementation
would be needed anyway.
> I guess we'd have to test directly whether the non-continuous memory
> of the mask and data would cause enough cache-miss problems to
> outweigh the potential cycle-savings from single byte comparisons in
The different memory buffers are each contiguous, so the access patterns
still have a lot of coherency. I intend to give the mask memory layouts
matching those of the arrays.
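A minimal sketch of what "matching layout" could look like, using only existing NumPy calls. It assumes a separate one-byte boolean mask buffer, which is an illustration and not the settled design: the mask is allocated with the same memory order as the data, so walking both buffers touches memory in the same pattern.

```python
import numpy as np

# Fortran-ordered data array, chosen to show that layout is preserved.
data = np.zeros((3, 4), dtype=np.float64, order="F")

# zeros_like defaults to order="K", which matches the layout of `data`
# as closely as possible. The bool mask is an assumption for illustration.
mask = np.zeros_like(data, dtype=np.bool_)

# Strides measured in elements rather than bytes, to compare layouts
# independently of item size.
data_elem_strides = tuple(s // data.itemsize for s in data.strides)
mask_elem_strides = tuple(s // mask.itemsize for s in mask.strides)
```

Both buffers are contiguous and have identical element strides, so an iteration order that is cache-friendly for the data is also cache-friendly for the mask.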
> I guess that one and only one of these will get written. I guess that
> one of these choices may be a lot more satisfying to the current and
> future masked array itch than the other.
I'm only going to implement one solution, yes.
> I'm personally worried that the memory overhead of array.masks will
> make many of us tend to avoid them. I work with images that can
> easily get large enough that I would not want an array-items size byte
> array added to my storage.
May I ask what kind of dtypes and sizes you're working with?
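The dtype matters because the relative cost of a mask depends on the item size. A rough sketch, assuming one mask byte per element (the thread does not fix the mask storage, and `mask_overhead` is a hypothetical helper, not a NumPy function):

```python
import numpy as np

def mask_overhead(shape, dtype):
    """Return mask bytes divided by data bytes, assuming a 1-byte mask per element."""
    n_items = int(np.prod(shape))
    data_bytes = n_items * np.dtype(dtype).itemsize
    mask_bytes = n_items  # assumption: one byte per element
    return mask_bytes / data_bytes

# An 8-bit image doubles in size; a float64 array grows by one eighth.
overhead_uint8 = mask_overhead((4096, 4096), np.uint8)
overhead_float64 = mask_overhead((4096, 4096), np.float64)
```

Under that assumption the worry is most acute for small item sizes such as 8-bit images, and much less so for 8-byte floats.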
> The reason I'm asking for more details about the implementation is
> because that is most of the argument for array.mask at the moment (1
> and 2 above).
I'm first trying to nail down more of the higher level requirements before
digging really deep into the implementation details. They greatly affect how
those details have to turn out.
> See you,