[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Matthew Brett matthew.brett@gmail....
Sat Oct 29 15:28:22 CDT 2011


Hi,

On Sat, Oct 29, 2011 at 12:41 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Sat, Oct 29, 2011 at 1:26 PM, Matthew Brett <matthew.brett@gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Oct 29, 2011 at 12:19 PM, Charles R Harris
>> <charlesr.harris@gmail.com> wrote:
>> >
>> >
>> > On Sat, Oct 29, 2011 at 1:04 PM, Matthew Brett <matthew.brett@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
>> >> <ralf.gommers@googlemail.com> wrote:
>> >> >
>> >> >
>> >> > On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
>> >> > <matthew.brett@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> >> >> <ralf.gommers@googlemail.com> wrote:
>> >> >> >
>> >> >> >
>> >> >> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
>> >> >> > <matthew.brett@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>> >> >> >> <charlesr.harris@gmail.com> wrote:
>> >> >> >> >>
>> >> >> >>
>> >> >> >> No, that's not what Nathaniel and I are saying at all. Nathaniel was
>> >> >> >> pointing to links for projects that care that everyone agrees before
>> >> >> >> they go ahead.
>> >> >> >
>> >> >> > It looked to me like there was a serious intent to come to an
>> >> >> > agreement, or at least closer together. The discussion in the summer
>> >> >> > was going around in circles though, and was too abstract and complex
>> >> >> > to follow. Therefore Mark's choice of implementing something and then
>> >> >> > asking for feedback made sense to me.
>> >> >>
>> >> >> I should point out that the implementation hasn't - as far as I can
>> >> >> see - changed the discussion.  The discussion was about the API.
>> >> >>
>> >> >> Implementations are useful for agreed APIs because they can point out
>> >> >> where the API does not make sense or cannot be implemented.  In this
>> >> >> case, the API Mark said he was going to implement - he did implement -
>> >> >> at least as far as I can see.  Again, I'm happy to be corrected.
>> >> >
>> >> > Implementations can also help the discussion along, by allowing people
>> >> > to try out some of the proposed changes. It also allows one to construct
>> >> > examples that show weaknesses, possibly to be solved by an alternative
>> >> > API. Maybe you can hold the complete history of this topic in your head
>> >> > and comprehend it, but for me it would be very helpful if someone said:
>> >> > - here's my dataset
>> >> > - this is what I want to do with it
>> >> > - this is the best I can do with the current implementation
>> >> > - here's how API X would allow me to solve this better or simpler
>> >> > This can be done much better with actual data and an actual
>> >> > implementation than with a design proposal. You seem to disagree with
>> >> > this statement. That's fine. I would hope though that you recognize that
>> >> > concrete examples help people like me, and construct one or two to help
>> >> > us out.
>> >> That's what use-cases are for in designing APIs.  There are examples
>> >> of use in the NEP:
>> >>
>> >> https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
>> >>
>> >> the alterNEP:
>> >>
>> >> https://gist.github.com/1056379
>> >>
>> >> and my longer email to Travis:
>> >>
>> >> http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
>> >>
>> >> Mark has done a nice job of documentation:
>> >>
>> >> http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
>> >>
>> >> If you want to understand what the alterNEP case is, I'd suggest the
>> >> email, just because it's the most recent and I think the terminology
>> >> is slightly clearer.
>> >>
>> >> Doing the same examples on a larger array won't make the point easier
>> >> to understand.  The discussion is about what the right concepts are,
>> >> and you can help by looking at the snippets of code in those
>> >> documents, and deciding for yourself whether you think the current
>> >> masking / NA implementation seems natural and easy to explain, or
>> >> rather forced and difficult to explain, and then email back trying to
>> >> explain your impression (which is not always easy).
>> >>
>> >> >> >> In saying that we are insisting on our way, you are saying,
>> >> >> >> implicitly, 'I am not going to negotiate'.
>> >> >> >
>> >> >> > That is only your interpretation. The observation that Mark
>> >> >> > compromised quite a bit while you didn't seems largely correct to me.
>> >> >>
>> >> >> The problem here stems from our inability to work towards agreement,
>> >> >> rather than standing on set positions.  I set out what changes I think
>> >> >> would make the current implementation OK.  Can we please, please have
>> >> >> a discussion about those points instead of trying to argue about who
>> >> >> has given more ground?
>> >> >>
>> >> >> > That commitment would of course be good. However, even if that were
>> >> >> > possible before writing code and everyone agreed that the ideas of you
>> >> >> > and Nathaniel should be implemented in full, it's still not clear that
>> >> >> > either of you would be willing to write any code. Agreement without
>> >> >> > code still doesn't help us very much.
>> >> >>
>> >> >> I'm going to return to Nathaniel's point - it is a highly valuable
>> >> >> thing to set ourselves the target of resolving substantial discussions
>> >> >> by consensus.   The route you are endorsing here is 'implementor wins'.
>> >> >
>> >> > I'm not. All I want to point out is that design and implementation are
>> >> > not completely separated either.
>> >>
>> >> No, they often interact.  I was trying to explain why, in this case,
>> >> the implementation hasn't changed the issues substantially, as far as
>> >> I can see.   If you think otherwise, then that is helpful information,
>> >> because you can feed back about where the initial discussion has been
>> >> overtaken by the implementation, and so we can strip down the
>> >> discussion to its essential parts.
>> >>
>> >> >> We don't need to do it that way.  We're a mature sensible
>> >> >> bunch of adults
>> >> >
>> >> > Agreed:)
>> >>
>> >> Ah - if only it was that easy :)
>> >>
>> >> >> who can talk out the issues until we agree they are
>> >> >> ready for implementation, and then implement.
>> >> >
>> >> > The history of this discussion doesn't suggest it is straightforward to
>> >> > get a design right first time. It's a complex subject.
>> >>
>> >> Right - and it's more complex when only some of the people involved
>> >> are interested in the discussion coming to a resolution.   That's
>> >> Nathaniel's point - that although it seems inefficient, working
>> >> towards a good resolution of big issues like this is very valuable in
>> >> getting the ideas right.
>> >>
>> >> > The second part of your statement, "and then implement", sounds so
>> >> > simple. The reality is that there are only a handful of developers who
>> >> > have done a significant amount of work on the numpy core in the last two
>> >> > years. I haven't seen anyone saying they are planning to implement (part
>> >> > of) whatever design the outcome of this discussion will be. I don't think
>> >> > it's strange to keep this in mind to some extent.
>> >>
>> >> No, but consensus building is a little bit all or none.   I guess we'd
>> >> all like consensus, but then sometimes, as Nathaniel points out, it is
>> >> inconvenient and annoying.  If we have no stated commitment to
>> >> consensus, at some unpredictable point in the discussion, those who
>> >> are implementing will - obviously - just duck out and do the
>> >> implementation.  I would do that, I guess.  Maybe I have done in the
>> >> projects I'm involved in.   The question Nathaniel is raising, and me
>> >> too, in a less coherent way, is - is that fine?    Does it matter that
>> >> we are short-cutting through substantial discussions?   Is that really
>> >> - in the long term - a more efficient way of building both the code
>> >> and the community?
>> >>
>> >
>> > Who is counted in building a consensus? I tend to pay attention to those
>> > who have made consistent contributions over the years, reviewed code, fixed
>> > bugs, and have generally been active in numpy development. In any group
>> > participation is important, people who just walk in the door and demand
>> > things be done their way aren't going to get a lot of respect. I'll happily
>> > listen to politely expressed feedback, especially if the feedback comes
>> > from someone who shows up to work, but that hasn't been my impression of
>> > the disagreements in this case. Heck, Nathaniel wasn't even tracking the
>> > Numpy pull requests or Mark's repository. That doesn't spell "participant"
>> > in my dictionary.
>>
>> I'm sorry, I am not obeying Ben's 10 minute rule.
>>
>> This is a very important point you are making, which is that those who
>> write the code have the final say.
>>
>> Is it fair to say that your responses show that you don't think either
>> Nathaniel or I have much of a say?
>>
>> It's fair to say I haven't contributed much code to numpy.
>>
>
> But you have contributed some, which immediately gives you more credibility.
>
>>
>> I could imagine some sort of voting system for which the voting is
>> weighted by lines of code contributed.
>
> Mark has been the man over the last year. By comparison, the rest of us have
> just been diddling around.
>
>>
>> I suspect you are thinking of an implicit version of such a system,
>> continuously employed.
>>
>> But Nathaniel's point is that other projects have gone out of their
>> way to avoid voting.  To quote from:
>>
>> http://producingoss.com/en/consensus-democracy.html
>>
>> "In general, taking a vote should be very rare—a last resort for when
>> all other options have failed. Don't think of voting as a great way to
>> resolve debates. It isn't. It ends discussion, and thereby ends
>> creative thinking about the problem. As long as discussion continues,
>> there is the possibility that someone will come up with a new solution
>> everyone likes. "
>>
>
> As Ralf pointed out, the core developers are a small handful at the moment.
> Now in one sense that presents an opportunity: anyone who has the time and
> inclination to contribute code and review pull requests is going to make an
> impact and rapidly gain influence. In a sense, leadership in the numpy
> community is up for grabs. But before you can claim the kingdom, there is
> the small matter of completing a quest or two.

Yes, this is well put - but I think I am asking for a less feudal
model of decision making.

The model you are offering is one of power - where power is acquired
by code contributions.  I suppose this model is attractive if you
don't believe that it is generally possible to achieve an agreed
solution through general and open discussion.

The more effective model is democratic, that is, we have faith in each
other to be reasonable and to negotiate in the best interests of the
project, and we use measures of influence as an absolute last resort,
and even then, this influence should be determined on explicit grounds
(such as agreement across the group, number of lines committed or some
other thing).

Best,

Matthew

