[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Olivier Delalleau shish@keba...
Sat Oct 29 17:02:16 CDT 2011


2011/10/29 Ralf Gommers <ralf.gommers@googlemail.com>

>
>
> On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
>
>> Hi,
>>
>> On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett <matthew.brett@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
>> > <ralf.gommers@googlemail.com> wrote:
>> >>
>> >>
>> >> On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett <matthew.brett@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
>> >>> <ralf.gommers@googlemail.com> wrote:
>> >>> >
>> >>> >
>> >>> > On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett <matthew.brett@gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> Hi,
>> >>> >>
>> >>> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> >>> >> <ralf.gommers@googlemail.com> wrote:
>> >>> >> >
>> >>> >> >
>> >>> >> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
>> >>> >> > <matthew.brett@gmail.com>
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> Hi,
>> >>> >> >>
>> >>> >> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>> >>> >> >> <charlesr.harris@gmail.com> wrote:
>> >>> >> >> >>
>> >>> >> >>
>> >>> >> >> No, that's not what Nathaniel and I are saying at all. Nathaniel was
>> >>> >> >> pointing to links for projects that care that everyone agrees before
>> >>> >> >> they go ahead.
>> >>> >> >
>> >>> >> > It looked to me like there was a serious intent to come to an agreement, or
>> >>> >> > at least closer together. The discussion in the summer was going around in
>> >>> >> > circles though, and was too abstract and complex to follow. Therefore Mark's
>> >>> >> > choice of implementing something and then asking for feedback made sense to
>> >>> >> > me.
>> >>> >>
>> >>> >> I should point out that the implementation hasn't - as far as I can
>> >>> >> see - changed the discussion.  The discussion was about the API.
>> >>> >>
>> >>> >> Implementations are useful for agreed APIs because they can point out
>> >>> >> where the API does not make sense or cannot be implemented.  In this
>> >>> >> case, the API Mark said he was going to implement - he did implement -
>> >>> >> at least as far as I can see.  Again, I'm happy to be corrected.
>> >>> >
>> >>> > Implementations can also help the discussion along, by allowing people to
>> >>> > try out some of the proposed changes. It also allows to construct examples
>> >>> > that show weaknesses, possibly to be solved by an alternative API. Maybe you
>> >>> > can hold the complete history of this topic in your head and comprehend it,
>> >>> > but for me it would be very helpful if someone said:
>> >>> > - here's my dataset
>> >>> > - this is what I want to do with it
>> >>> > - this is the best I can do with the current implementation
>> >>> > - here's how API X would allow me to solve this better or simpler
>> >>> > This can be done much better with actual data and an actual implementation
>> >>> > than with a design proposal. You seem to disagree with this statement.
>> >>> > That's fine. I would hope though that you recognize that concrete examples
>> >>> > help people like me, and construct one or two to help us out.
>> >>> That's what use-cases are for in designing APIs.  There are examples
>> >>> of use in the NEP:
>> >>>
>> >>> https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
>> >>>
>> >>> the alterNEP:
>> >>>
>> >>> https://gist.github.com/1056379
>> >>>
>> >>> and my longer email to Travis:
>> >>>
>> >>> http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
>> >>>
>> >>> Mark has done a nice job of documentation:
>> >>>
>> >>> http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
>> >>>
>> >>> If you want to understand what the alterNEP case is, I'd suggest the
>> >>> email, just because it's the most recent and I think the terminology
>> >>> is slightly clearer.
>> >>>
>> >>> Doing the same examples on a larger array won't make the point easier
>> >>> to understand.  The discussion is about what the right concepts are,
>> >>> and you can help by looking at the snippets of code in those
>> >>> documents, and deciding for yourself whether you think the current
>> >>> masking / NA implementation seems natural and easy to explain, or
>> >>> rather forced and difficult to explain, and then email back trying to
>> >>> explain your impression (which is not always easy).
>> >>
>> >> If you seriously believe that looking at a few snippets is as helpful and
>> >> instructive as being able to play around with them in IPython and modify
>> >> them, then I guess we won't make progress in this part of the discussion.
>> >> You're just telling me to go back and re-read things I'd already read.
>> >
>> > The snippets are in ipython or doctest format - aren't they?
>>
>> Oops - 10 minute rule.  Now I see that you mean that you can't
>> experiment with the alternative implementation without working code.
>>
>
> Indeed.
>
>
>> That's true, but I am hoping that the difference between - say:
>>
>> a[0:2] = np.NA
>>
>> and
>>
>> a.mask[0:2] = False
>>
>> would be easy enough to imagine.
>
>
> It is in this case. I agree the explicit ``a.mask`` is clearer. This is a
> quite specific point that could be improved in the current implementation.
> It doesn't require ripping everything out.
>
> Ralf
>

I haven't been following the discussion closely, but shouldn't that instead be:
a.mask[0:2] = True?

It's something that I actually find a bit difficult to get right in the
current numpy.ma implementation: I would find it more intuitive to have True
for "valid" data, and False for invalid / missing / ... I realize that the
implementation makes sense (and is appropriate given that the name is
"mask"), but I just thought I'd point this out... even if it's just me ;)
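
For reference, here is a minimal sketch of the numpy.ma convention I mean,
where True in the mask marks an entry as masked (invalid / missing). It uses
the existing numpy.ma module, not the NA-mask API under discussion, and the
array values and variable names are made up for illustration.

```python
import numpy as np

# Made-up example data.
data = np.array([1.0, 2.0, 3.0, 4.0])
a = np.ma.array(data)

# Mask the first two entries; in numpy.ma this sets their mask to True.
a[0:2] = np.ma.masked

# numpy.ma convention: True means "masked", i.e. excluded from computations.
mask = np.ma.getmaskarray(a)

# Only the unmasked entries (3.0 and 4.0) contribute here.
total = a.sum()
n_valid = a.count()  # number of unmasked entries

# A "validity" array (True = valid data) is the logical inverse of the mask.
valid = ~mask
```

So the "True for valid data" reading is always available as the inverse of
the mask; it is just not what the attribute itself stores.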

-=- Olivier