[Numpy-discussion] Numpy governance update

Nathaniel Smith njs@pobox....
Thu Feb 16 10:56:39 CST 2012


On Thu, Feb 16, 2012 at 12:27 AM, Dag Sverre Seljebotn
<d.s.seljebotn@astro.uio.no> wrote:
> If non-contributing users came along on the Cython list demanding that
> we set up a system to select non-developers along on a board that would
> have discussions in order to veto pull requests, I don't know whether
> we'd ignore it or ridicule it or try to show some patience, but we
> certainly wouldn't take it seriously.

I'm not really worried about Continuum having some nefarious
"corporate" intent. But I am worried about how these plans will affect
numpy, and I think there are serious risks if we don't think about
process. Money has a dramatic effect on FOSS development, and not
always in a positive way, even when -- or *especially* when --
everyone has the best of intentions. I'm actually *more* worried about
altruistic full-time developers doing work on behalf of the community
than I am about developers who are working strictly in some company's
interests.

Finding a good design for software is like a nasty optimization
problem -- it's easy to get stuck in local maxima, and any one person
has only an imperfect, noisy estimate of the objective function. So
you need lots of eyes to catch mistakes, filter out the noise, and
explore multiple maxima in parallel.

The classic FOSS model of volunteer developers who are in charge of
project direction does a *great* job of solving this problem. (Linux
beat all the classic Unixen on technical quality, and it did it using
college students and volunteers -- it's not like Sun, IBM, HP etc.
couldn't afford better engineers! But they still lost.) Volunteers are
intimately familiar with the itch they're trying to scratch and the
trade-offs involved in doing so, and they need to work together to
produce anything major, so you get lots of different, high-quality
perspectives to help you figure out which approach is best.

Developers who are working for some corporate interest alter this
balance, because in a "do-ocracy", someone who can throw a few
full-time developers at something suddenly has effectively
complete control over project direction. There's no moral problem here
when the "dictator" is benevolent, but suddenly you have an
informational bottleneck -- even benevolent dictators make mistakes,
and they certainly aren't omniscient. Even this isn't *so* bad though,
so long as the corporation is scratching their own itch -- you can be
pretty sure that whatever they produce will at least make
them happy, which implies a certain level of utility.

The riskiest case is paying developers to scratch someone else's itch.
IIUC, that's a major goal of Travis's here, to find a way to pay
developers to make numpy better for everyone. But, now you need some
way for the community to figure out what "better" means, because the
developers themselves don't necessarily know. It's not their itch
anymore. Running a poll or whatever might be a nice start, but we all
know how tough it is to extract useful design information from users.
You need a lot more than that if you want to keep the quality up.

Travis's proposal is that we go from a large number of self-selecting
people putting in little bits of time to a small number of designated
people putting in lots of time. There's a major win in terms of total
effort, but you inevitably lose a lot of diversity of viewpoints. My
feeling is it will only be a net win if the new employees put serious,
bend-over-backwards effort into taking advantage of the volunteer
community's wisdom.

This is why the NA discussion seems so relevant to me here -- everyone
involved absolutely had good intentions, excellent skills, etc., and
yet the outcome is still a huge unresolved mess. It was supposed to
make numpy more attractive for a certain set of applications, like
statistical analysis, where R is currently preferred. Instead, there
have been massive changes merged into numpy mainline, but most of the
intended "target market" for these changes is indifferent to them;
they don't solve the problem they're supposed to. And along the way
we've not just spent a bunch of Enthought's money, but also wasted
dozens of hours of volunteer time while seriously alienating some of
numpy's most dedicated advocates in that "target market". We could
debate about blame, and I'm sure there's plenty to spread around, but
I also think the fundamental problem isn't one of blame at all -- it's
that Mark, Charles and Travis *aren't* scratching an itch; AFAICT the
NA functionality is not something they actually need themselves. Which
means they're fighting uphill when trying to find the best solutions,
and haven't managed it yet. And they were working on a deadline, to boot.

> It's obvious that one should try for consensus as long as possible,
> including listening to users. But in the very end, when agreement can't
> be reached by other means, the developers are the one making the calls.
> (This is simply a consequence that they are the only ones who can
> credibly threaten to fork the project.)
>
> Sure, structures that includes users in the process could be useful...
> but, if the devs are fine with the current situation (and I don't see
> Mark or Charles complaining), then I honestly think it is quite rude to
> not let the matter drop after the first ten posts or so.

I'm not convinced we need a formal governing body, but I think we
really, really need a community norm that takes consensus *very*
seriously. That principle is more important than who exactly enforces
it. I guess people are worried about that turning into obstructionism
or something, but seriously, this is a practical approach that works
well for lots of real actual successful FOSS projects.

I think it's also worth distinguishing between "users" and "developers
who happen not to be numpy core developers". There are lots of
experienced and skilled developers who spend their time on, say, scipy
or nipy or whatever, just because numpy already works for them. That
doesn't mean they don't have valuable insights or a stake in how numpy
develops going forward!

IMHO, everyone who can credibly participate in the technical
discussion should have a veto -- and should almost never use it. And
yes, that means volunteers should be able to screw up corporate
schedules if that's what's best for numpy-the-project. And, to be
clear, I'm not saying that random list-members somehow *deserve* to
screw around with generous corporate endowments; I'm saying that the
people running the corporation are going to be a lot happier in the
long run if they impose this rule on themselves.

-- Nathaniel

