[Numpy-discussion] Numpy governance update
Charles R Harris
Thu Feb 16 12:28:16 CST 2012
On Thu, Feb 16, 2012 at 11:09 AM, <email@example.com> wrote:
> On Thu, Feb 16, 2012 at 12:53 PM, Charles R Harris
> <firstname.lastname@example.org> wrote:
> > On Thu, Feb 16, 2012 at 9:56 AM, Nathaniel Smith <email@example.com> wrote:
> >> On Thu, Feb 16, 2012 at 12:27 AM, Dag Sverre Seljebotn
> >> <firstname.lastname@example.org> wrote:
> >> > If non-contributing users came along on the Cython list demanding that
> >> > we set up a system to select non-developers to sit on a board that
> >> > holds discussions in order to veto pull requests, I don't know whether
> >> > we'd ignore it or ridicule it or try to show some patience, but we
> >> > certainly wouldn't take it seriously.
> >> I'm not really worried about the Continuum having some nefarious
> >> "corporate" intent. But I am worried about how these plans will affect
> >> numpy, and I think there are serious risks if we don't think about
> >> process. Money has a dramatic effect on FOSS development, and not
> >> always in a positive way, even when -- or *especially* when --
> >> everyone has the best of intentions. I'm actually *more* worried about
> >> altruistic full-time developers doing work on behalf of the community
> >> than I am about developers who are working strictly in some company's
> >> interests.
> >> Finding a good design for software is like a nasty optimization
> >> problem -- it's easy to get stuck in local maxima, and any one person
> >> has only an imperfect, noisy estimate of the objective function. So
> >> you need lots of eyes to catch mistakes, filter out the noise, and
> >> explore multiple maxima in parallel.
> >> The classic FOSS model of volunteer developers who are in charge of
> >> project direction does a *great* job of solving this problem. (Linux
> >> beat all the classic Unixen on technical quality, and it did it using
> >> college students and volunteers -- it's not like Sun, IBM, HP etc.
> >> couldn't afford better engineers! But they still lost.) Volunteers are
> >> intimately familiar with the itch they're trying to scratch and the
> >> trade-offs involved in doing so, and they need to work together to
> >> produce anything major, so you get lots of different, high-quality
> >> perspectives to help you figure out which approach is best.
> > > Linux is probably a bad choice as an example here. Right up to about 2002 Linus
> > > was pretty much the only entry point into mainline as he applied all the
> > > patches by hand and reviewed all of them. This of course slowed Linux
> > > development considerably. I also had the opportunity to fix up some of the
> > > drivers for my own machine and can testify that the code quality of the
> > > patches was mixed. Now, of course, with 10000 or more patches going in
> > > during the open period of each development cycle, Linus relies on
> > > lieutenants to handle the subsystems, but he can be damn scathing when he
> > > takes an interest in some code and doesn't like what he sees. And he can
> > > be scathing, not just because he started the whole thing, but because he is
> > > darn good and the other developers respect that. But my point here is that
> > > Linus pretty much shapes Linux.
> >> Developers who are working for some corporate interest alter this
> >> balance, because in a "do-ocracy", someone who can throw a few
> >> full-time developers at something suddenly has effectively
> >> complete control over project direction. There's no moral problem here
> >> when the "dictator" is benevolent, but suddenly you have an
> >> informational bottleneck -- even benevolent dictators make mistakes,
> >> and they certainly aren't omniscient. Even this isn't *so* bad though,
> >> so long as the corporation is scratching their own itch -- at least
> >> you can be pretty sure that whatever they produce will at least make
> >> them happy, which implies a certain level of utility.
> > > Linus deals with this by saying, fork, fork, fork. Of course the GPL makes
> > > that a more viable response.
> >> The riskiest case is paying developers to scratch someone else's itch.
> >> IIUC, that's a major goal of Travis's here, to find a way to pay
> >> developers to make numpy better for everyone. But, now you need some
> >> way for the community to figure out what "better" means, because the
> >> developers themselves don't necessarily know. It's not their itch
> >> anymore. Running a poll or whatever might be a nice start, but we all
> >> know how tough it is to extract useful design information from users.
> >> You need a lot more than that if you want to keep the quality up.
> >> Travis's proposal is that we go from a large number of self-selecting
> >> people putting in little bits of time to a small number of designated
> >> people putting in lots of time. There's a major win in terms of total
> >> effort, but you inevitably lose a lot of diversity of viewpoints. My
> >> feeling is it will only be a net win if the new employees put serious,
> >> bend-over-backwards effort into taking advantage of the volunteer
> >> community's wisdom.
> >> This is why the NA discussion seems so relevant to me here -- everyone
> >> involved absolutely had good intentions, excellent skills, etc., and
> >> yet the outcome is still a huge unresolved mess. It was supposed to
> >> make numpy more attractive for a certain set of applications, like
> >> statistical analysis, where R is currently preferred. Instead, there
> >> have been massive changes merged into numpy mainline, but most of the
> >> intended "target market" for these changes is indifferent to them;
> >> they don't solve the problem they're supposed to. And along the way
> >> we've not just spent a bunch of Enthought's money, but also wasted
> >> dozens of hours of volunteer time while seriously alienating some of
> >> numpy's most dedicated advocates in that "target market". We could
> >> debate about blame, and I'm sure there's plenty to spread around, but
> >> I also think the fundamental problem isn't one of blame at all -- it's
> >> that Mark, Charles and Travis *aren't* scratching an itch; AFAICT the
> >> NA functionality is not something they actually need themselves. Which
> >> means they're fighting uphill when trying to find the best solutions,
> >> and haven't managed it yet. And were working on a deadline, to boot.
> >> > It's obvious that one should try for consensus as long as possible,
> >> > including listening to users. But in the very end, when agreement cannot
> >> > be reached by other means, the developers are the ones making the calls.
> >> > (This is simply a consequence of the fact that they are the only ones who can
> >> > credibly threaten to fork the project.)
> >> >
> >> > Sure, structures that include users in the process could be useful...
> >> > but, if the devs are fine with the current situation (and I don't see
> >> > Mark or Charles complaining), then I honestly think it is quite rude to
> >> > not let the matter drop after the first ten posts or so.
> >> I'm not convinced we need a formal governing body, but I think we
> >> really, really need a community norm that takes consensus *very*
> >> seriously. That principle is more important than who exactly enforces
> >> it. I guess people are worried about that turning into obstructionism
> >> or something, but seriously, this is a practical approach that works
> >> well for lots of real actual successful FOSS projects.
> >> I think it's also worth distinguishing between "users" and "developers
> >> who happen not to be numpy core developers". There are lots of
> >> experienced and skilled developers who spend their time on, say, scipy
> >> or nipy or whatever, just because numpy already works for them. That
> >> doesn't mean they don't have valuable insights or a stake in how numpy
> >> develops going forward!
> >> IMHO, everyone who can credibly participate in the technical
> >> discussion should have a veto -- and should almost never use it. And
> >> yes, that means volunteers should be able to screw up corporate
> >> schedules if that's what's best for numpy-the-project. And, to be
> >> clear, I'm not saying that random list-members somehow *deserve* to
> >> screw around with generous corporate endowments; I'm saying that the
> >> people running the corporation are going to be a lot happier in the
> >> long run if they impose this rule on themselves.
> > I'm more for the Linux model, Linus rules, the rest grovel ;)
To clarify this a bit more, there isn't an 'official' Linux. Linus' tree is
the reference due to history and his own central position in the community,
but he would argue that it is *his* tree, and what goes in, or out, is
*his* choice. If you want to do things differently, you have your own tree,
go for it. Linus is a bit of a radical libertarian when it comes to code.
Debian is much more democratic, but the stable releases also tend to lag
2-3 years behind current development. That's one of the reasons there is a
spot for Debian based distributions like Ubuntu.
> I would feel a lot more comfortable with a BDFL that has code coverage
> and ABI consistency high on his priority, and not just getting the
> greatest new features in as fast as possible.
I think this is a good point, which is why the idea of a long term release
is appealing. That release should be stodgy and safe, while the ongoing
development can be much more radical in making changes. And numpy really
does need a fairly radical rewrite, just to clarify and simplify the base
code if nothing else. New features I'm more leery about, at least
until the code base is improved, which would be my short term priority.