[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Matthew Brett matthew.brett@gmail....
Sat Oct 29 15:48:47 CDT 2011


Hi,

On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
<ralf.gommers@googlemail.com> wrote:
>
>
> On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett <matthew.brett@gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
>> <ralf.gommers@googlemail.com> wrote:
>> >
>> >
>> > On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett <matthew.brett@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> >> <ralf.gommers@googlemail.com> wrote:
>> >> >
>> >> >
>> >> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
>> >> > <matthew.brett@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>> >> >> <charlesr.harris@gmail.com> wrote:
>> >> >> >>
>> >> >>
>> >> >> No, that's not what Nathaniel and I are saying at all. Nathaniel was
>> >> >> pointing to links for projects that care that everyone agrees before
>> >> >> they go ahead.
>> >> >
>> >> > It looked to me like there was a serious intent to come to an
>> >> > agreement, or at least closer together. The discussion in the summer
>> >> > was going around in circles though, and was too abstract and complex
>> >> > to follow. Therefore Mark's choice of implementing something and then
>> >> > asking for feedback made sense to me.
>> >>
>> >> I should point out that the implementation hasn't - as far as I can
>> >> see - changed the discussion.  The discussion was about the API.
>> >>
>> >> Implementations are useful for agreed APIs because they can point out
>> >> where the API does not make sense or cannot be implemented.  In this
>> >> case, the API Mark said he was going to implement - he did implement -
>> >> at least as far as I can see.  Again, I'm happy to be corrected.
>> >
>> > Implementations can also help the discussion along, by allowing people to
>> > try out some of the proposed changes. It also allows to construct examples
>> > that show weaknesses, possibly to be solved by an alternative API. Maybe you
>> > can hold the complete history of this topic in your head and comprehend it,
>> > but for me it would be very helpful if someone said:
>> > - here's my dataset
>> > - this is what I want to do with it
>> > - this is the best I can do with the current implementation
>> > - here's how API X would allow me to solve this better or simpler
>> > This can be done much better with actual data and an actual implementation
>> > than with a design proposal. You seem to disagree with this statement.
>> > That's fine. I would hope though that you recognize that concrete examples
>> > help people like me, and construct one or two to help us out.
>> That's what use-cases are for in designing APIs.  There are examples
>> of use in the NEP:
>>
>> https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
>>
>> the alterNEP:
>>
>> https://gist.github.com/1056379
>>
>> and my longer email to Travis:
>>
>>
>> http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
>>
>> Mark has done a nice job of documentation:
>>
>> http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
>>
>> If you want to understand what the alterNEP case is, I'd suggest the
>> email, just because it's the most recent and I think the terminology
>> is slightly clearer.
>>
>> Doing the same examples on a larger array won't make the point easier
>> to understand.  The discussion is about what the right concepts are,
>> and you can help by looking at the snippets of code in those
>> documents, and deciding for yourself whether you think the current
>> masking / NA implementation seems natural and easy to explain, or
>> rather forced and difficult to explain, and then email back trying to
>> explain your impression (which is not always easy).
>
> If you seriously believe that looking at a few snippets is as helpful and
> instructive as being able to play around with them in IPython and modify
> them, then I guess we won't make progress in this part of the discussion.
> You're just telling me to go back and re-read things I'd already read.

The snippets are in IPython or doctest format - aren't they?

> OK, update: I took Ben's 10 minutes to go back and read the reference doc
> and your email again, just in case. The current implementation still seems
> natural to me to explain. It fits my use-cases. Perhaps that's different for
> you because you and I deal with different kinds of data. I don't have to
> explicitly treat absent and ignored data differently; those two are actually
> mixed and indistinguishable already in much of my data. Therefore the
> current implementation works well for me, having to make a distinction would
> be a needless complication.

OK - I'm not sure that contributes much to the discussion, because the
problem is being able to explain to each other in detail why one
solution is preferable to another.  To follow your own advice, you'd
post some code snippets showing how you'd see the two ideas playing
out and why one is clearer than the other.
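
A rough sketch of the kind of snippet meant here, under loud assumptions:
numpy.ma masking stands in for "ignored" data and NaN stands in for
"absent" data.  Neither is the API of the NEP or the alterNEP, and the
variable names are made up for illustration only.

```python
import numpy as np
import numpy.ma as ma

data = np.array([1.0, 2.0, 3.0, 4.0])

# "Ignored" (assumption: modelled with numpy.ma): the value still exists
# underneath the mask, it is only skipped in computations.
ignored = ma.masked_array(data, mask=[False, True, False, False])
ignored_sum = ignored.sum()      # masked element is skipped
recovered = ignored.data[1]      # the underlying value is still there

# "Absent" (assumption: modelled with NaN): there is no value to recover,
# and the missingness propagates unless explicitly skipped.
absent = data.copy()
absent[1] = np.nan
absent_sum = absent.sum()            # propagates as NaN
absent_skipped = np.nansum(absent)   # explicit request to skip it
```

The point of such a snippet would be to show whether keeping the two cases
separate, as here, reads as clearer or as a needless complication for a
given dataset.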

Best,

Matthew

