[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Matthew Brett matthew.brett@gmail....
Sat Jun 25 10:10:40 CDT 2011


Hi,

On Sat, Jun 25, 2011 at 4:05 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett <matthew.brett@gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
>> <charlesr.harris@gmail.com> wrote:
>> >
>> >
>> > On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett <matthew.brett@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
>> >> <charlesr.harris@gmail.com> wrote:
>> >> >
>> >> >
>> >> > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM <pgmdevlist@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> This thread is getting quite long, innit ?
>> >> >> And I think it's getting a tad confusing, because we're mixing two
>> >> >> different concepts: missing values and masks.
>> >> >> There should be support for missing values in numpy.core; I think
>> >> >> we all agree on that.
>> >> >> * The suggestion of adding new dtypes (nafloat, naint) is great,
>> >> >> but why not make them the default, then ?
>> >> >>
>> >> >> * Operations involving a NA (whatever the NA actually is, depending
>> >> >> on the dtype of the input) should result in a NA (whatever NA is
>> >> >> defined by the output's dtype). That could be done by overloading
>> >> >> the existing ufuncs to support the new dtypes.
>> >> >> * There should be some simple methods to retrieve the location of
>> >> >> those NAs in an array. Whether we just output the indices or a full
>> >> >> boolean array (w/ True for a NA, False for a non-NA or vice-versa)
>> >> >> needs to be decided.
>> >> >> * We can always re-implement masked arrays to use these NAs in a way
>> >> >> which would be consistent with numpy.ma (so as not to confuse
>> >> >> existing users of numpy.ma): a mask would be a boolean array with
>> >> >> the same shape as the underlying ndarray, with True for NA.
>> >> >> Mark, I'd suggest you modify your proposal, making it clearer that
>> >> >> it's not to add all of numpy.ma's functionality to the core, but
>> >> >> just to support these missing values. The term 'mask' should be
>> >> >> avoided as much as possible; use 'missing data' or whatever.
>> >> >
>> >> > I think he aims to support both.
>> >>
>> >> I don't think Mark is proposing to support both.  He's proposing to
>> >> implement only array.mask.
>> >>
>> >
>> > I think you are confusing function with implementation. If you look at
>> > the current NEP, it does NA but does so by using masks behind the
>> > scenes in a transparent manner.
>>
>> Yes, there is some confusion; just to be clear, I'm pointing out that
>> Mark is not proposing to implement na-dtypes, and is proposing to
>> implement array.mask.
>>
>
> I would say he is proposing to implement na-dtypes by means of masks.

But that would be confusing function with implementation.  array.mask
is the implementation with an extra mask array.  na-dtypes is the
implementation with values in the array set to special missing values.
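
To make the distinction concrete, here is a rough sketch in present-day
numpy. Neither array.mask nor na-dtypes exists yet, so numpy.ma stands in
for the extra-mask-array approach and NaN stands in for a special missing
value; both stand-ins are assumptions for illustration, not what the NEP
specifies.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])

# (1) Mask approach: the data buffer is left alone and a separate boolean
# array of the same shape records which elements are missing.
# numpy.ma is used here only as a stand-in for the proposed array.mask.
mask = np.array([False, True, False, False])  # True means "missing"
masked = np.ma.masked_array(data, mask=mask)
masked_sum = masked.sum()                     # skips the masked element
value_under_mask = masked.data[1]             # original value is still there

# (2) NA-in-the-data approach: the missing element is marked by writing a
# special value into the buffer itself, overwriting what was there.
# NaN is an assumed stand-in for a real NA bit pattern.
na_style = data.copy()
na_style[1] = np.nan
na_locations = np.isnan(na_style)             # recover the "mask" from the data
na_sum = np.nansum(na_style)                  # skips the NaN element
```

Both give the same answer for the sum, but only the first keeps the
original value around underneath, at the cost of carrying a second array.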

See you,

Matthew

