[Numpy-discussion] What is consensus anyway

Nathaniel Smith njs@pobox....
Mon Apr 23 14:33:27 CDT 2012

On Mon, Apr 23, 2012 at 1:04 AM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
> On Sun, Apr 22, 2012 at 4:15 PM, Nathaniel Smith <njs@pobox.com> wrote:
>> If you hang around big FOSS projects, you'll see the word "consensus"
>> come up a lot. For example, the glibc steering committee recently
>> dissolved itself in favor of governance "directly by the consensus of
>> the people active in glibc development"[1]. It's the governing rule of
>> the IETF, which defines many of the most important internet
>> standards[2]. It is the "primary way decisions are made on
>> Wikipedia"[3]. It's "one of the fundamental aspects of accomplishing
>> things within the Apache framework"[4].
>> [1] https://lwn.net/Articles/488778/
>> [2] https://www.ietf.org/tao.html#getting.things.done
>> [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus
>> [4] https://www.apache.org/foundation/voting.html
>> But it turns out that this "consensus" thing is actually somewhat
>> mysterious, and most programmers immersed in this culture pick it up
>> by osmosis. And numpy in particular has a lot of developers
>> who are not coming from a classic FOSS programmer background! So this
>> is my personal attempt to articulate what it is, and why requiring
>> consensus is probably the best possible approach to project decision
>> making.
>> So what is "consensus"? Like, voting or something?
>> -----------------------------------------------------
>> This is surprisingly subtle and specific.
>> "Consensus" means something like, "everyone who cares is satisfied
>> with the result".
>> It does *not* mean
>> * Every opinion counts equally
>> * We vote on anything
>> * Every solution must be perfect and flawless
>> * Every solution must leave everyone overjoyed
>> * Everyone must sign off on every solution.
>> It *does* mean
>> * We invite people to speak up
>> * We generally trust individuals to decide how important their opinion is
>> * We generally trust individuals to decide whether or not they can
>> live with some outcome
>> * If they can't, then we take the time to find something better.
>> One simple way of stating this is, everyone has a veto. In practice,
>> such vetoes are almost never used, so this rule is not particularly
>> illuminating on its own. Hence, the rest of this document.
>> What a waste of time! That all sounds very pretty on paper, but we
>> have stuff to get done.
>> -----------------------------------------------------------------------------------
>> First, I'll note that this seemingly utopian scheme has a track record
>> of producing such impractical systems as TCP/IP, SMTP, DNS, Apache,
>> GCC, Linux, Samba, Python, ...
> Linux is Linus' private tree. Everything that goes in is his decision,
> everything that stays out is his decision. Of course, he delegates much of
> the work to people he trusts, but it doesn't even reach the level of a BDFL,
> it's DFL. As for consensus, it basically comes down to convincing the
> gatekeepers one level below Linus that your code might be useful. So bad
> example. Same with TCP/IP, which was basically Kahn and Cerf consulting with
> a few others and working by request of DARPA. GCC was Richard Stallman (I
> got one of the first tapes for a $30 donation), Python was Guido. Some of
> the projects later developed some form of governance but Guido, for
> instance, can veto anything he dislikes even if he is disinclined to do so.
> I'm not saying you're wrong about open source, I'm just saying that
> each project differs and it is wrong to imply that they follow some common
> form of governance under the rubric FOSS and that they all seek consensus.
> And they certainly don't *start* that way. And there are also plenty of
> projects that fail when the prime mover loses interest or folks get tired of
> the politics.

So a few points here:

Consensus-based decision-making is an ideal and a guide, not an
algorithm. There's nothing at all inconsistent between having a BDFL
and using consensus as the primary guide for decision making -- it
just means that the BDFL chooses to exercise their power in that way,
and is generally trusted to make judgement calls about specific cases.
See Fernando's reply down-thread for an example of this.

And I'm not saying that all FOSS projects follow some common form of
governance. But I am saying that there's a substantial amount of
shared development culture across most successful FOSS projects, and a
ton of experience on how to run a project successfully. Project
management is a difficult and arcane skill set, and one that's hard to
learn except through apprenticeship and osmosis. And it's definitely
not included in most courses on programming for scientists! So it'd be
nice if numpy could avoid re-making mistakes that other projects have
already made...

But the other effect of this being cultural values rather than
something explicit and articulated is that sometimes you can't see it
from the outside. For example:

Linux: Technically, everything you say is true. In practice, good luck
convincing Linus or a subsystem maintainer to accept your patch when
other people are raising substantive complaints. Here's an email I
googled up in a few moments, in which Linus yells at people for trying
to submit a patch to him without making sure that all interested
parties have agreed:
Stuff regularly sits outside the kernel tree in limbo for *years*
while people debate different approaches back and forth.

Of course the kernel development process is far more complicated than
I can capture with a bit of amateur anthropology here, but think about
this: why do all these multinational companies *care* what some guy
named Linus puts in his tree? I don't think they're impressed by how
his name sounds similar to "Linux". But, his trees consistently do
well enough at the things they care about that they stick around,
i.e., empirically, he's achieving reasonable consensus. And when that
fails, like, say, with the Android fork, then you can see what a mess
results. (This is the "you *will* achieve consensus sooner or later"

GCC: I just asked my friend Zack Weinberg about this via IM -- among
other things, he wrote the current C preprocessor, and used to be one
of the dozen people who had blanket write access to the GCC repo. His
response was that yes, GCC was originally run by RMS as dictator, and
then Richard Kenner as dictator, and then, "you remember that EGCS
fork that happened back in the nineties?  That was because Kenner
didn't scale, and people wanted a more consensus-based process.  And
that was so successful that it became the official branch". He also
pointed out that "the way things actually get done on a day-to-day
basis in GCC can look an awful lot like 'committers do what they want'
if you only read the mailing list casually, but that's because
everyone with blanket commit rights is trusted to not fuck up".

TCP/IP: I'm not exactly privy to how Kahn and Cerf worked, but (1)
they didn't exactly design it in a vacuum and impose it by fiat -- in
fact, it was originally like a series of academic articles developed
over 5+ years, wasn't it? (2) we're not using their TCP/IP, either.
That stopped working in the mid-80s:
TCP/IP has been under IETF stewardship for a *long* time. And in any
case, nothing about consensus-based decision making rules out having a
few geniuses produce some beautiful design. It's about how you
recognize when that has happened.

I think you get the point. I don't think there are any examples of
Guido saying "hey, I like this feature, so I'm going to put it into
the next Python release, and then see whether people like it or not
and decide what to do with it then". That would be really shocking,

>> So mainly what I'm saying we should do is:
>> 1. Make it as easy as possible for people to see what's going on and
>> join the discussion. All decisions and reasoning behind decisions take
>> place in public. (On this note, it would be *really* good if pull
>> request notifications went to the list.)
>> 2. If someone raises a substantive objection, take that seriously.
>> 3. If someone says "no, this is just not going to work for me,
>> because... <something substantive here>", then it can't go in.
> What happens when someone wants to spend all their time talking about
> process? It can get kind of old.

Yeah, such discussions can certainly be exhausting.

Anyway, the answer is in the bit about obstructive people down below
-- if you think someone's behaving in a way that's destructive to the
project, then I'd suggest gathering evidence, making a case, etc.:
"You may not persuade the person in question, but that's okay as long
as you persuade everyone else."

> It seems top heavy for an organization that has maybe three people working
> on Numpy C code in their spare time. I think the ideal here would be for
> someone to produce their own version as a working example and then we could
> discuss merging code, and also have something to play with.

I'm not sure what in the above list is "top heavy" -- can you
elaborate? If three people are all we have to support many thousands
of users, then to me those rules seem like a good way for them to get
feedback and avoid wasting limited resources. And perhaps they'll help
get more people involved. (This is a point Zack made to me too: "I
wanted to get involved with GCC for some time before I actually could,
and EGCS was what made it possible".)

-- Nathaniel
