[Numpy-discussion] feedback request: proposal to add masks to the core ndarray
Fri Jun 24 14:46:41 CDT 2011
On Fri, Jun 24, 2011 at 1:18 PM, Matthew Brett <email@example.com> wrote:
> On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe <firstname.lastname@example.org> wrote:
> > On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett <email@example.com>
> > wrote:
> >> Hi,
> >> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith <firstname.lastname@example.org> wrote:
> >> and the fact that 'missing_value' could be any type would make the
> >> code more complicated than the current case where the mask is always
> >> bools or something?
> > I'm referring to the underlying C implementations of the dtypes and any
> > additional custom dtypes that people create. With the masked approach,
> > you implement a new custom data type in C, and it automatically works with
> > missing data. With the custom dtype approach, you have to do a lot more
> > error-prone work to handle the special values in all the ufuncs.
> This is pure ignorance on my part: I can see that the ufuncs need to
> handle the missing values, but I can't immediately see why that would
> be much more complicated than the 'every array might have a mask'
> implementation. This was what I was trying to say with my silly
> missing_value = np.dtype.missing_value
> for e in oned_array:
>     if e == missing_value:
> well - you get the idea. Obviously this is what you've been thinking
> about, I just wanted to get a grasp of where the extra complexity is
> coming from compared to:
> for i, e in enumerate(one_d_array):
>     if one_d_array.mask[i] == False:
With the masked approach, I plan to add a default mask-support mechanism to
the ufunc machinery that calls the existing unmasked loop on the chunks where
it can, and then give individual ufuncs the ability to provide faster
mask-aware inner loops. Doing this in a general way for individual na* dtypes
would require more special cases to handle the different dtype sizes and the
specifics of each dtype.
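
To make the "call the unmasked loop on the chunks that it can" idea concrete, here is a rough Python sketch. It is only an illustration, not the planned C implementation. The names unmasked_runs, apply_with_default_mask and negate_loop are made up for this example, and it assumes the convention from the quoted loop above, where a mask value of False means the element is valid.

```python
import numpy as np

def unmasked_runs(mask):
    """Return (start, stop) pairs for each contiguous run of valid elements.

    Assumption for this sketch: mask[i] == False means element i is valid.
    """
    valid = ~np.asarray(mask, dtype=bool)
    # Pad with False on both sides so every run has a detectable start and stop.
    padded = np.concatenate(([False], valid, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate: run start, run stop, run start, run stop, ...
    return list(zip(edges[0::2], edges[1::2]))

def apply_with_default_mask(unmasked_loop, data, mask, out):
    """Hypothetical default wrapper: feed only the valid chunks to a loop
    that knows nothing about masks."""
    for start, stop in unmasked_runs(mask):
        unmasked_loop(data[start:stop], out[start:stop])
    return out

def negate_loop(src, dst):
    # Stand-in for a dtype's existing inner loop; it never sees the mask.
    np.negative(src, out=dst)

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
mask = np.array([False, False, True, False, True])
out = np.zeros_like(data)
apply_with_default_mask(negate_loop, data, mask, out)
```

The point of the sketch is that negate_loop needs no changes to work with masked input, which is why a new custom dtype would pick up missing-data support without extra work. A ufunc that wants to avoid the chunking overhead could still supply its own mask-aware loop.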