[Numpy-discussion] Insights / lessons learned from NumPy design
Dag Sverre Seljebotn
Wed Jan 9 12:04:23 CST 2013
On 01/09/2013 04:41 PM, Benjamin Root wrote:
> On Wed, Jan 9, 2013 at 9:58 AM, Nathaniel Smith wrote:
> On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac wrote:
> > I'm just a Python+NumPy user and not a CS type.
> > May I ask a naive question on this thread?
> > Given the work that has (as I understand it) gone into
> > making NumPy usable as a C library, why is the discussion not
> > going in a direction like the following:
> > What changes to the NumPy code base would be required for it
> > to provide useful ndarray functionality in a C extension
> > to Clojure? Is this simply incompatible with the goal that
> > Clojure compile to JVM byte code?
> IIUC that work was done on a fork of numpy which has since been
> abandoned by its authors, so... yeah, numpy itself doesn't have much
> to offer in this area right now. It could in principle with a bunch of
> refactoring (ideally not on a fork, since we saw how well that went),
> but I don't think most happy current numpy users are wishing they
> could switch to writing Lisp on the JVM or vice-versa, so I don't
> think it's surprising that no-one's jumped up to do this work.
> If I could just point out that the attempt to fork numpy for the .NET
> work was done back in the subversion days, and there was little-to-no
> effort to incrementally merge back changes to master, and vice-versa.
> With git as our repository now, such work may be more feasible.
This is a matter of personal software design taste I guess, so the
following is very subjective.
I don't think there's anything at all to gain from this. In 2013 (and
presumably, the future), a static C or C++ library is IMO fundamentally
incompatible with achieving optimal performance. Going through a major
refactor simply to end up with something that's no faster and no more
flexible than what NumPy is today seems sort of pointless to me.
What one wants is to generate ufuncs etc. on the fly using LLVM that are
tuned to the specific tiling pattern of a specific operation, not a
static C or C++ library (even with C++ meta-programming, the
combinatorial explosion kills you if you do it all at compile-time).
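
To make the contrast concrete, here is a minimal sketch of generating a specialized kernel on demand rather than shipping every variant precompiled. It is only an illustration under stated assumptions: plain Python source generation with exec() stands in for LLVM code generation, and the names specialize, apply_binary and _TEMPLATE are hypothetical, not part of NumPy, Numba or minivect.

```python
import functools

# Hypothetical illustration: Python source generation stands in for LLVM codegen.
# The operator and the tile size are baked into the generated code as constants.
_TEMPLATE = """
def kernel(a, b, out):
    n = len(out)
    # Walk the data in tiles of {tile} elements.
    for start in range(0, n, {tile}):
        stop = min(start + {tile}, n)
        for i in range(start, stop):
            out[i] = a[i] {op} b[i]
    return out
"""


@functools.lru_cache(maxsize=None)
def specialize(op, tile):
    """Generate and compile a kernel for one (operator, tile size) pair on demand."""
    if op not in ("+", "-", "*"):
        raise ValueError("unsupported operator: %r" % (op,))
    if not isinstance(tile, int) or tile < 1:
        raise ValueError("tile must be a positive integer")
    namespace = {}
    source = _TEMPLATE.format(op=op, tile=tile)
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["kernel"]


def apply_binary(op, a, b, tile=4):
    """Apply an elementwise binary operation using a kernel built at first use."""
    out = [0] * len(a)
    return specialize(op, tile)(a, b, out)


if __name__ == "__main__":
    print(apply_binary("+", [1, 2, 3, 4, 5], [10, 20, 30, 40, 50], tile=2))
```

Only the (operator, tile) pairs that are actually requested get built, and each is cached after its first use. A static library would have to compile every combination of operation, dtype, layout and tiling ahead of time, which is where the combinatorial explosion comes from.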
Granted, one could probably write a C++ library that was more of a
compiler, using LLVM to emit code. But that would mean starting all over,
so it's not really relevant to the question of a NumPy refactor.
As I understand it, this is how Continuum thinks too, with Numba as a
back-end for Blaze. (And Travis also spoke about this in his "farewell
address".)
Finally, Mark Florisson sort of started this with the 'minivect' library
last summer, which could serve as a "ufunc" backend for both Cython and
Numba (which for this purpose are different languages). However, as I
understand it, the focus is now more on developing Numba directly rather
than minivect (which is understandable, as that's quicker).