[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Mark Wiebe mwwiebe@gmail....
Fri Jun 24 14:46:41 CDT 2011


On Fri, Jun 24, 2011 at 1:18 PM, Matthew Brett <matthew.brett@gmail.com> wrote:

> Hi,
>
> On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe <mwwiebe@gmail.com> wrote:
> > On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett <matthew.brett@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith <njs@pobox.com> wrote:
> ...
> >> and the fact that 'missing_value' could be any type would make the
> >> code more complicated than the current case where the mask is always
> >> bools or something?
> >
> > I'm referring to the underlying C implementations of the dtypes and any
> > additional custom dtypes that people create. With the masked approach, you
> > implement a new custom data type in C, and it automatically works with
> > missing data. With the custom dtype approach, you have to do a lot more
> > error-prone work to handle the special values in all the ufuncs.
>
> This is just pure ignorance on my part: I can see that the ufuncs
> need to handle the missing values, but I can't immediately see why
> that would be much more complicated than the 'every array might have a
> mask' implementation.  This was what I was trying to say with my silly
> sketch:
>
> missing_value = np.dtype.missing_value
>
> for e in oned_array:
>     if e == missing_value:
>
> well - you get the idea.  Obviously this is what you've been thinking
> about, I just wanted to get a grasp of where the extra complexity is
> coming from compared to:
>
> for i, e in enumerate(one_d_array):
>     if one_d_array.mask[i] == False:
>

With the masked approach, I plan to add a default mask-support mechanism to
the ufunc machinery that calls the existing unmasked inner loop on the chunks
where it can, and then let individual ufuncs provide faster mask-aware inner
loops. Doing the same thing in a general way for individual na* dtypes would
require more special cases to handle the different dtype sizes and their
specifics.
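
To make the "default mask support" idea concrete, here is a rough Python
sketch of the fallback, not the actual C implementation. It assumes that
"the chunks that it can" means contiguous runs of unmasked elements, and it
uses a boolean array named `valid` where True means the element is present.
The names `unmasked_runs`, `apply_with_default_mask`, and `fill` are made up
for illustration only.

```python
import numpy as np

def unmasked_runs(valid):
    """Yield (start, stop) for each maximal run of True values in `valid`."""
    valid = np.asarray(valid, dtype=bool)
    # Pad with False on both sides so every run has a visible start and end.
    padded = np.concatenate(([False], valid, [False]))
    # Positions where the value flips; they alternate start, stop (exclusive).
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    for start, stop in zip(edges[::2], edges[1::2]):
        yield int(start), int(stop)

def apply_with_default_mask(unmasked_loop, data, valid, fill=0):
    """Hypothetical default fallback: run an unmasked loop on each valid run.

    `unmasked_loop` knows nothing about masks; it only ever sees slices
    that contain no missing elements. Masked positions get `fill`.
    """
    out = np.full(data.shape, fill, dtype=data.dtype)
    for start, stop in unmasked_runs(valid):
        out[start:stop] = unmasked_loop(data[start:stop])
    return out

data = np.array([1.0, 4.0, 9.0, 16.0])
valid = np.array([True, True, False, True])
result = apply_with_default_mask(np.sqrt, data, valid)
print(result)
```

The point of the sketch is that the wrapper never looks at the element
values or the dtype, only at the mask, so the same fallback serves any
unmasked loop. A sentinel-based na* dtype would instead need a per-dtype
test for "is this element missing" inside each loop.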

-Mark


>
> Cheers,
>
> Matthew