[Numpy-discussion] What is consensus anyway

David Cournapeau cournape@gmail....
Wed Apr 25 18:06:04 CDT 2012


On Wed, Apr 25, 2012 at 10:54 PM, Matthew Brett <matthew.brett@gmail.com> wrote:

> Hi,
>
> On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant <travis@continuum.io>
> wrote:
> >>
> >> Do you agree that Numpy has not been very successful in recruiting and
> >> maintaining new developers compared to its large user-base?
> >>
> >> Compared to - say - Sympy?
> >>
> >> Why do you think this is?
> >
> > I think it's mostly because it's infrastructure that is a means to an
> > end. I certainly wasn't excited to have to work on NumPy originally, when
> > my main interest was SciPy. I've come to love the interesting plateau
> > that NumPy lives on. But, I think it mostly does the job it is supposed
> > to do. The fact that it is in C is also not very sexy. It is also
> > rather complicated with a lot of inter-related parts.
> >
> > I think NumPy could do much, much more --- but getting there is going to
> > be a challenge of execution and education.
> >
> > You can get to know the code base. It just takes some time and
> > patience. You also have to be comfortable with compilers and building
> > software just to tweak the code.
> >
> >
> >>
> >> Would you consider asking that question directly on list and asking
> >> for the most honest possible answers?
> >
> > I'm always interested in honest answers and welcome any sincere
> > perspective.
>
> Of course, there are potential explanations:
>
> 1) Numpy is too low-level for most people
> 2) The C code is too complicated
> 3) It's fine already, more or less
>
> are some obvious ones. I would say these are the easy answers. But of
> course, the easy answer may not be the right answer. It may not be
> easy to get the right answer [1].   As you can see from Alan Isaac's reply
> on this thread, even asking the question can be taken as being in bad
> faith.  In that situation, I think you'll find it hard to get sincere
> replies.
>

While I don't think jumping into the NumPy C code is as difficult as some
people make it out to be, I think numpy has already picked most of the
low-hanging fruit, and is now at a stage where it requires massive
investment to get significantly better.

I would suggest a different question, whose answer may serve as a proxy for
explaining the lack of contributions: what needs to be done in NumPy, and how
can we make it simpler for newcomers? Here is an incomplete,
shamelessly biased list:

  - Fewer dependencies on CPython internals
  - Allow 3rd parties to extend numpy at the C level in more
fundamental ways (e.g. I wish something like a half-float dtype could be
more easily developed out of tree)
  - Separate memory representation from higher level representation
(slicing, broadcasting, etc…), to allow arrays to "sit" on non-contiguous
memory areas, etc…
  - Test and performance infrastructure so we can track our evolution, get
coverage of our C code, etc…
  - Fix bugs
  - Better integration with 3rd party on-disk storage (database, etc…)

None of that is particularly simple or quick to learn, except
for fixing bugs and maybe some of the infrastructure. I think most of this
is necessary for the things Travis talked about a few weeks ago.

What could make contributions easier:
  - different levels of C API documentation (we still lack anything besides
the reference)
  - ways to detect early when we break the ABI or slightly more obscure
platforms (we need good CI, ways to publish binaries that people can easily
test, etc...)
  - improve infrastructure so that we can focus on the things we want to
work on (improve the dire situation with bug tracking, etc…)

Also, lots of people just don't know or don't want to know C. But people
with, say, web skills would be welcome: we have a website that could use some
help…

So