[Numpy-discussion] NA masks in the next numpy release?
Fri Oct 28 16:01:52 CDT 2011
On Fri, Oct 28, 2011 at 1:52 PM, Benjamin Root <email@example.com> wrote:
> On Fri, Oct 28, 2011 at 3:22 PM, Matthew Brett <firstname.lastname@example.org> wrote:
>> On Fri, Oct 28, 2011 at 1:14 PM, Benjamin Root <email@example.com> wrote:
>> > On Fri, Oct 28, 2011 at 3:02 PM, Matthew Brett <firstname.lastname@example.org>
>> > wrote:
>> >> You and I know that I've got an array with values [99, 100, 3] and a
>> >> mask with values [False, False, True]. So maybe I'd like to see what
>> >> happens if I take off the mask from the second value. I know that's
>> >> what I want to do, but I don't know how to do it, because you won't
>> >> let me manipulate the mask, because I'm not allowed to know that the
>> >> NA values come from the mask.
>> >> The alterNEP is just saying - please - be straight with me. If
>> >> you're doing masking, show me the mask, and don't try and hide that
>> >> there are stored values underneath.
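The scenario in the quoted paragraph can be sketched with the existing numpy.ma module. This is only an illustration: numpy.ma stands in for the NA-mask implementation under discussion, whose interface may differ, and the variable names are hypothetical.

```python
import numpy.ma as ma

# Values and mask from the example: True marks a masked-out element.
a = ma.masked_array([99, 100, 3], mask=[False, False, True])

print(ma.getdata(a))       # the stored values, including the masked one
print(ma.getmaskarray(a))  # the mask itself

# Reductions skip the masked element.
print(a.sum())  # 199

# "Taking off the mask" on a copy exposes the stored value again.
b = a.copy()
b.mask = [False, False, False]
print(b.sum())  # 202
```

Here the mask and the stored values are both visible, which is the behaviour the alterNEP asks for.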
>> > Considering that you have admitted before to not regularly using masked
>> > arrays, I seriously doubt that you would be able to judge whether this
>> > is a significant detriment or not. My entire point that I have been
>> > making is that Mark's implementation is not the same as the current
>> > masked arrays. Instead, it is a cleaner, more mature implementation that
>> > gets rid of extraneous "features".
>> This may explain why we don't seem to be getting anywhere. I am sure
>> that Mark's implementation of masking is great. We're not talking
>> about that. We're talking about whether it's a good idea to make
>> masking look as though it is implementing the ABSENT idea. That's
>> what I think is confusing, and that's the conversation I have been
>> trying to pursue.
> Sorry if I came across too strongly there. No disrespect was intended.
I wasn't worried about the disrespect. It's just I feel the
discussion has not been to the point.
> Personally, I think we are getting somewhere. We have been whittling away
> what it is that we do agree upon, and have begun to specify *exactly* what
> it is that we disagree on. I understand your concern, and -- like I
> said in my previous email -- it makes sense from the perspective that
> numpy.ma users have had up to now.
But I'm not a numpy.ma user, I'm just someone who knows that what you
are doing is masking out values. The fact that I do not use numpy.ma
points out that it's possible to find this highly counter-intuitive
without prior bias.
> But, I re-raise my point that I have been making
> about the need to re-think masked arrays. If we consider masks as advanced
> slicing or boolean indexing, then being unable to access the underlying
> values actually makes a lot of sense.
> Consider it a contract when I pass a set of data with only certain values
> exposed. Because I passed the data with only those values exposed, then it
> must have been entirely my intention to let the function know of only those
> values. It would be a violation of that contract if the function obtained
> those masked values. If I want to communicate both the original values and
> a particular mask, then I pass the array and a view with a particular mask.
This is the old discussion about what Python users expect. I think
they expect to be treated as adults. That is, breaking the contract
should not be easy to do by accident, but it should be allowed.
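A rough sketch of the "contract" idea and of breaking it on purpose, again using numpy.ma as a stand-in for the implementation under discussion. The function name `mean_of_exposed` and the variables are hypothetical, chosen only for illustration.

```python
import numpy as np
import numpy.ma as ma

def mean_of_exposed(values):
    """Work only with the values the caller chose to expose."""
    return values.mean()

original = np.array([99.0, 100.0, 3.0])
# The caller keeps `original` and passes only a masked version of it.
masked_view = ma.masked_array(original, mask=[False, False, True])

# Honouring the contract: the masked value does not contribute.
honoured = mean_of_exposed(masked_view)
print(honoured)  # 99.5

# Breaking the contract takes an explicit step, so it is hard to do
# by accident, but it is still possible.
broken = ma.getdata(masked_view).mean()
print(broken)
```

The explicit `ma.getdata` call is the kind of deliberate act meant above: allowed, but not something that happens by accident.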
> Maybe it would be helpful that an array can never have its own mask, but
> rather, only views can carry masks?
> In conclusion, I submit that this is largely a problem that can be solved
> with the proper documentation. New users who never used numpy.ma before do
> not have to concern themselves with the old way of thinking and are just
> simply taught what masked arrays "are". Meanwhile, a special section of the
> documentation should be made that teaches numpy.ma users how masked arrays
> "should be".
I don't think documentation will solve it. In a way, the ideal user
is someone who doesn't know what's going on, because, for a while,
they may not realize that when they thought they were doing
assignment, in fact they are doing masking. Unfortunately, I suspect
almost everyone using these things will start to realize that, and
then they will start getting confused. I find it confusing, and I
believe myself to understand the issues pretty well, and be of
numpy-user-range comprehension powers.
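To make the assignment-versus-masking point concrete, here is a small sketch with numpy.ma. It is not the NA-mask implementation being discussed, and the variable names are illustrative.

```python
import numpy.ma as ma

arr = ma.masked_array([99, 100, 3])

# This reads like assigning a value to element 2 ...
arr[2] = ma.masked

# ... but only the mask changed; the stored value is still there.
print(ma.getmaskarray(arr).tolist())  # [False, False, True]
print(ma.getdata(arr)[2])             # 3
```

The statement looks like an ordinary assignment, yet nothing was written to the data buffer.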