[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Matthew Brett matthew.brett@gmail....
Fri Oct 28 19:47:01 CDT 2011


Hi,

On Fri, Oct 28, 2011 at 4:53 PM, Benjamin Root <ben.root@ou.edu> wrote:
>
>
> On Friday, October 28, 2011, Matthew Brett <matthew.brett@gmail.com> wrote:
>> Hi,
>>
>> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> <ralf.gommers@googlemail.com> wrote:
>>>
>>>
>>> On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett <matthew.brett@gmail.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>>>> <charlesr.harris@gmail.com> wrote:
>>>> >
>>>> >
>>>> > On Fri, Oct 28, 2011 at 3:56 PM, Matthew Brett
>>>> > <matthew.brett@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> On Fri, Oct 28, 2011 at 2:43 PM, Matthew Brett
>>>> >> <matthew.brett@gmail.com>
>>>> >> wrote:
>>>> >> > Hi,
>>>> >> >
>>>> >> > On Fri, Oct 28, 2011 at 2:41 PM, Charles R Harris
>>>> >> > <charlesr.harris@gmail.com> wrote:
>>>> >> >>
>>>> >> >>
>>>> >> >> On Fri, Oct 28, 2011 at 3:16 PM, Nathaniel Smith <njs@pobox.com>
>>>> >> >> wrote:
>>>> >> >>>
>>>> >> >>> On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant
>>>> >> >>> <oliphant@enthought.com>
>>>> >> >>> wrote:
>>>> >> >>> > I think Nathaniel and Matthew provided very specific feedback
>>>> >> >>> > that was helpful in understanding other perspectives of a
>>>> >> >>> > difficult problem.  In particular, I really wanted bit-patterns
>>>> >> >>> > implemented.  However, I also understand that Mark did quite a
>>>> >> >>> > bit of work and altered his original designs quite a bit in
>>>> >> >>> > response to community feedback.  I wasn't a major part of the
>>>> >> >>> > pull request discussion, nor did I merge the changes, but I
>>>> >> >>> > support Charles if he reviewed the code and felt like it was
>>>> >> >>> > the right thing to do.  I likely would have done the same thing
>>>> >> >>> > rather than let Mark Wiebe's work languish.
>>>> >> >>>
>>>> >> >>> My connectivity is spotty this week, so I'll stay out of the
>>>> >> >>> technical discussion for now, but I want to share a story.
>>>> >> >>>
>>>> >> >>> Maybe a year ago now, Jonathan Taylor and I were debating what
>>>> >> >>> the best API for describing statistical models would be --
>>>> >> >>> whether we wanted something like R's "formulas" (which I
>>>> >> >>> supported), or another approach based on sympy (his idea). To
>>>> >> >>> summarize, I thought his API was confusing, pointlessly
>>>> >> >>> complicated, and didn't actually solve the problem; he thought
>>>> >> >>> R-style formulas were superficially simpler but hopelessly
>>>> >> >>> confused and inconsistent underneath. Now, obviously, I was
>>>> >> >>> right and he was wrong. Well, obvious to me, anyway... ;-) But
>>>> >> >>> it wasn't like I could just wave a wand and make his arguments
>>>> >> >>> go away,
>>>> >> >>> no [...]
>>
>> I should point out that the implementation hasn't - as far as I can
>> see - changed the discussion.  The discussion was about the API.
>> Implementations are useful for agreed APIs because they can point out
>> where the API does not make sense or cannot be implemented.  In this
>> case, the API Mark said he was going to implement - he did implement -
>> at least as far as I can see.  Again, I'm happy to be corrected.
>>
>>>> In saying that we are insisting on our way, you are saying,
>>>> implicitly, 'I am not going to negotiate'.
>>>
>>> That is only your interpretation. The observation that Mark compromised
>>> quite a bit while you didn't seems largely correct to me.
>>
>> The problem here stems from our inability to work towards agreement,
>> rather than standing on set positions.  I set out what changes I think
>> would make the current implementation OK.  Can we please, please have
>> a discussion about those points instead of trying to argue about who
>> has given more ground?
>>
>>> That commitment would of course be good. However, even if that were
>>> possible before writing code and everyone agreed that the ideas of you
>>> and Nathaniel should be implemented in full, it's still not clear that
>>> either of you would be willing to write any code. Agreement without
>>> code still doesn't help us very much.
>>
>> I'm going to return to Nathaniel's point - it is a highly valuable
>> thing to set ourselves the target of resolving substantial discussions
>> by consensus.   The route you are endorsing here is 'implementor
>> wins'.   We don't need to do it that way.  We're a mature sensible
>> bunch of adults who can talk out the issues until we agree they are
>> ready for implementation, and then implement.  That's all Nathaniel is
>> saying.  I think he's obviously right, and I'm sad that it isn't as
>> clear to y'all as it is to me.
>>
>> Best,
>>
>> Matthew
>>
>
> Everyone, can we please not do this?! I had enough of adults doing finger
> pointing back over the summer during the whole debt ceiling debate.  I think
> we can all agree that we are better than the US congress?

Yes, please.

> Forget about rudeness or decision processes.

No, that's a common mistake: assuming that any conversation about
non-technical things is unimportant.  Nathaniel's point is important.
Rudeness is important.  We've got into this mess because we clearly
don't have an agreed way of making decisions.  That's why countries
and open-source projects have constitutions, so this doesn't happen.

> I will start by saying that I am willing to separate ignore and absent, but
> only on the write side of things.  On read, I want a single way to identify
> the missing values.  I also want only a single way to perform calculations
> (either skip or propagate).

Thank you - that is very helpful.

Are you saying that you'd be OK setting missing values like this?

>>> a.mask[0:2] = False

For the read side, do you mean you're OK with this:

>>> a.isna()

to identify the missing values, as is currently the case?  Or something else?

If so, then I think we're very close; it's just a discussion about names.
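
To make the write-side / read-side split concrete, here is a minimal
pure-Python sketch.  It is only an illustration under stated assumptions:
the class NAArray and its methods ignore(), unignore(), set_absent() and
the skipna flag are hypothetical names invented for this example, not the
API in the current implementation or in any NumPy release.  It shows two
separate ways to mark a value as missing, one isna() query that covers
both, and a single calculation path whose skip-or-propagate choice is
left open as a flag.

```python
class NAArray:
    """Hypothetical toy container; not NumPy's actual NA API."""

    def __init__(self, values):
        self._values = list(values)
        # Two separate write-side states per element.
        self._ignored = [False] * len(self._values)
        self._absent = [False] * len(self._values)

    def ignore(self, index):
        """Temporarily hide a value; the payload is kept."""
        self._ignored[index] = True

    def unignore(self, index):
        """Expose a previously ignored value again."""
        self._ignored[index] = False

    def set_absent(self, index):
        """Mark a value as never recorded; the payload is discarded."""
        self._absent[index] = True
        self._values[index] = None

    def isna(self):
        """Single read-side query covering both kinds of missingness."""
        return [i or a for i, a in zip(self._ignored, self._absent)]

    def sum(self, skipna=False):
        """Propagate by default; skip missing values when skipna is True."""
        missing = self.isna()
        if any(missing) and not skipna:
            return None  # stands in for an NA result
        return sum(v for v, m in zip(self._values, missing) if not m)


a = NAArray([1.0, 2.0, 3.0, 4.0])
a.ignore(0)        # write side, kind 1: hidden but recoverable
a.set_absent(1)    # write side, kind 2: gone for good
print(a.isna())    # read side: one answer for both kinds
print(a.sum())     # propagates because something is missing
a.unignore(0)      # the ignored value comes back, the absent one does not
print(a.sum(skipna=True))
```

In this sketch the only place the two kinds differ is on write: an ignored
value can be restored, an absent one cannot, and every reader sees the
same isna() result.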

> An indicator of success would be that people stop using NaNs and magic
> numbers (-9999, anyone?) and we could even deprecate nansum(), or at least
> strongly suggest in its docs to use NA.

That is an excellent benchmark.
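
As a small illustration of why magic numbers and NaNs are worth replacing,
here is a pure-Python sketch.  The readings list and the variable names
are made up for this example.  A -9999 sentinel silently corrupts a sum,
a NaN at least propagates visibly, and skipping has to be done by hand
(which is what nansum() does for NaNs in NumPy).

```python
import math

# Hypothetical data: None marks a reading that was never recorded.
readings = [12.5, 13.0, None, 11.5]

# Encoding 1: a magic number stands in for the missing reading.
magic = [-9999.0 if r is None else r for r in readings]

# Encoding 2: NaN stands in for the missing reading.
nans = [math.nan if r is None else r for r in readings]

# The sentinel is treated as real data, so the total is silently wrong.
magic_total = sum(magic)

# NaN propagates, so the total is NaN and the problem is visible.
nan_total = sum(nans)

# Skipping the missing reading must be requested explicitly.
skip_total = sum(x for x in nans if not math.isnan(x))

print(magic_total, nan_total, skip_total)
```

Neither encoding works for integer data without giving up a valid value,
which is part of the case for a dedicated NA.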

Best,

Matthew

