[Numpy-discussion] DARPA funding for Blaze and passing the NumPy torch
Mon Dec 17 01:07:03 CST 2012
On Mon, Dec 17, 2012 at 7:07 AM, Travis Oliphant <firstname.lastname@example.org> wrote:
> Hello all,
> There is a lot happening in my life right now and I am spread quite thin
> among the various projects that I take an interest in. In particular, I
> am thrilled to publicly announce on this list that Continuum Analytics has
> received DARPA funding (to the tune of at least $3 million) for Blaze,
> Numba, and Bokeh which we are writing to take NumPy, SciPy, and
> visualization into the domain of very large data sets. This is part of
> the XDATA program, and I will be taking an active role in it. You can
> read more about Blaze here: http://blaze.pydata.org. You can read more
> about XDATA here: http://www.darpa.mil/Our_Work/I2O/Programs/XDATA.aspx
Hi Travis, that is fantastic news, congratulations! I can't wait to see
what you guys will come up with in the near future.
Also thank you for the rest of this thoughtful post; it'll take me some
time to digest but I enjoyed the reflection on the past.
> I personally think Blaze is the future of array-oriented computing in
> Python. I will be putting efforts and resources next year behind making
> that case. How it interacts with future incarnations of NumPy, Pandas, or
> other projects is an interesting and open question. I have no doubt the
> future will be a rich ecosystem of interoperating array-oriented
> data-structures. I invite anyone interested in Blaze to participate in
> the discussions and development at
> https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev or watch
> the project on our public GitHub repo:
> https://github.com/ContinuumIO/blaze. Blaze is being incubated under the
> ContinuumIO GitHub project for now, but eventually I hope it will receive
> its own GitHub project page later next year. Development of Blaze is
> early but we are moving rapidly with it (and have deliverable deadlines ---
> thus while we will welcome input and pull requests we won't have a ton of
> time to respond to simple queries until
> at least May or June). There is more that we are working on behind
> the scenes with respect to Blaze that will be coming out next year as well
> but isn't quite ready to show yet.
> As I look at the coming months and years, my time for direct involvement
> in NumPy development is therefore only going to get smaller. As a result
> it is not appropriate that I remain as "head steward" of the NumPy project
> (a term I prefer to BFD12 or anything else). I'm sure that it is apparent
> that while I've tried to help personally where I can this year on the NumPy
> project, my role has been more one of coordination, seeking funding, and
> providing expert advice on certain sections of code. I fundamentally
> agree with Fernando Perez that the responsibility of care-taking open
> source projects is one of stewardship --- something akin to public service.
> I have tried to emulate that belief this year --- even while not always
> succeeding.
> It is time for me to make official what is already becoming apparent to
> observers of this community, namely, that I am stepping down as someone who
> might be considered "head steward" for the NumPy project and officially
> leaving the development of the project in the hands of others in the
> community. I don't think the project actually needs a new "head steward"
> --- especially from a development perspective. Instead I see a lot of
> strong developers offering key opinions for the project as well as a great
> set of new developers offering pull requests.
> My strong suggestion is that development discussions of the project
> continue on this list with consensus among the active participants being
> the goal for development. I don't think 100% consensus is a rigid
> requirement --- but certainly a super-majority should be the goal, and
> serious changes should not be made without a clear consensus. I would
> pay special attention to under-represented people (users with intense usage
> of NumPy but small voices on this list). There are many of them. If
> you push me for specifics then at this point in NumPy's history, I would
> say that if Chuck, Nathaniel, and Ralf agree on a course of action, it will
> likely be a good thing for the project. I suspect that even if only 2 of
> the 3 agree at one time it might still be a good thing (but I would expect
> more detail and discussion). There are others whose opinion should be
> sought as well: Ondrej Certik, Perry Greenfield, Robert Kern, David
> Cournapeau, Francesc Alted, and Mark Wiebe to
> name a few. For some questions, I might even seek input from people
> like Konrad Hinsen and Paul Dubois --- if they have time to give it. I
> will still be willing to offer my view from time to time and if I am asked.
> Greg Wilson (of Software Carpentry fame) asked me recently what letter I
> would have written to myself 5 years ago. What would I tell myself to do
> given the knowledge I have now? I've thought about that for a bit, and
> I have some answers. I don't know if these will help anyone, but I offer
> them as hopefully instructive:
> 1) Do not promise to not break the ABI of NumPy --- and in fact
> emphasize that it will be broken at least once in the 1.X series. NumPy
> was designed to add new data-types --- but not without breaking the ABI.
> NumPy has needed more data-types and still needs even more. While it's
> not beautifully simple to add new data-types, it can be done. But, it is
> impossible to add them without breaking the ABI in some fashion. The
> desire to add new data-types *and* keep ABI compatibility has led to
> significant pain. I think the ABI non-breakage goal has been amplified by
> the poor state of package management in Python. The fact that it's
> painful for someone to update their downstream packages when an upstream
> ABI breaks (on Windows and Mac in particular) has put a lot of unfortunate
> pressure on this community. Pressure that was not envisioned or
> understood when I was writing NumPy.
> (As an aside: This is one reason Continuum has invested resources in
> building the conda tool and a completely free set of binary packages called
> Anaconda CE which is becoming more and more usable thanks to the efforts of
> Bryan Van de Ven and Ilan Schnell and our testing team at Continuum. The
> conda tool: http://docs.continuum.io/conda/index.html is open source and
> BSD licensed and the next release will provide the ability to build
> packages, build indexes on package repositories and interface with pip.
> Expect a blog-post in the near future about how cool conda is!).
> 2) Don't create array-scalars. Instead, make the data-type object
> a meta-type object whose instances are the items returned from NumPy
> arrays. There is no need for a separate array-scalar object and in fact
> it's confusing to the type-system. I understand that now. I did not
> understand that 5 years ago.
> 3) Special-case small arrays to avoid the memory indirection and
> look at PDL so that generalized ufuncs are supported from the beginning.
> 4) Define missing-value data-types and labels on the dimensions
> and arrays
> 5) Define a standard "dictionary of NumPy arrays" interface as the
> basic "structure of arrays" concept to go with the "array of structures"
> that structured arrays provide.
> 6) Start work on SQL interface to NumPy arrays *now*
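To make item 5 concrete, here is a minimal sketch of the two layouts it contrasts. The field names, the sample values, and the `to_structured` helper are all made up for illustration, and the plain `dict` is only one possible reading of a "dictionary of NumPy arrays" interface; nothing here is a standard NumPy API beyond structured arrays themselves.

```python
import numpy as np

# "Array of structures": one structured array whose elements are records.
# Field names and values are hypothetical sample data.
aos = np.array(
    [(1, 2.5), (2, 3.5), (3, 4.5)],
    dtype=[("id", "i4"), ("value", "f8")],
)

# "Structure of arrays": a plain dict mapping each field name to its own
# contiguous 1-D array. This dict layout is an assumption for illustration.
soa = {name: np.ascontiguousarray(aos[name]) for name in aos.dtype.names}


def to_structured(columns):
    """Pack a dict of equal-length 1-D arrays back into a structured array.

    Hypothetical helper, not part of NumPy.
    """
    names = list(columns)
    dtype = [(name, columns[name].dtype) for name in names]
    out = np.empty(len(columns[names[0]]), dtype=dtype)
    for name in names:
        out[name] = columns[name]
    return out


roundtrip = to_structured(soa)

# In the structured array, consecutive "value" entries are one whole record
# apart in memory; in the dict layout each column is densely packed.
print(aos["value"].strides, soa["value"].strides)
```

The dict form keeps each column contiguous, which tends to suit column-wise operations, while the structured array keeps each record together.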
> Additional comments I would make to someone today:
> 1) Most of NumPy should be written in Python with Numba used as
> the compiler (particularly as soon as Numba gets the ability to create
> Python extension modules which is in the next release).
> 2) There are still many, many optimizations that can be made in
> NumPy run-time (especially in the face of modern hardware).
> I will continue to be available to answer questions and I may chime in
> here and there on pull requests. However, most of my time for NumPy will
> be on administrative aspects of the project where I will continue to take
> an active interest. To help make sure that this happens in a transparent
> way, I would like to propose that "administrative" support of the project
> be left to the NumFOCUS board of which I am currently 1 of 9 members. The
> other board members are currently: Ralf Gommers, Anthony Scopatz, Andy
> Terrel, Prabhu Ramachandran, Fernando Perez, Emmanuelle Gouillart, Jarrod
> Millman, and Perry Greenfield. While NumFOCUS basically seeks to
> promote and fund the entire scientific Python stack, I think it can also
> play a role in helping to administer some of the core projects which the
> board members themselves have a personal interest in.
> By administrative support, I mean decisions like "what should be done with
> any NumPy IP or web-domains" or "what kind of commercially-related ads or
> otherwise should go on the NumPy home page", or "what should be done with
> the NumPy GitHub account", etc. --- basically anything that requires an
> executive decision that is not directly development related. I don't
> expect there to be many of these decisions. But, when they show up, I
> would like them to be made in as transparent and public a way as
> possible. In practice, the way I see this working is that there are
> members of the NumPy community who are (like me) particularly interested in
> admin-related questions and serve on a NumPy team in the NumFOCUS
> organization. I just know I'll be attending NumFOCUS board meetings,
> and I would like to help move administrative decisions forward with NumPy
> as part of the time I spend thinking about NumFOCUS.
> If people on this list would like to play an active role in those admin
> discussions, then I would heartily welcome them into NumFOCUS membership
> where they would work with interested members of the NumFOCUS board (like
> me and Ralf) to help direct that organization. I would really love to
> have someone from this list volunteer to serve on the NumPy team as part of
> the NumFOCUS project. I am certainly going to be interested in the
> opinions of people who are active participants on this list and on GitHub
> pages for NumPy on anything admin related to NumPy, and I expect Ralf would
> also be very interested in those views.
> One admin discussion that I will bring up in another email (as this one is
> already too long) is about making 2 or 3 lists for NumPy such as
> email@example.com, firstname.lastname@example.org, and numpy-users@numpy-org.
> Just because I'll be spending more time on Blaze, Numba, Bokeh, and the
> PyData ecosystem does not mean that I won't be around for NumPy. I will
> continue to promote NumPy. My involvement with Continuum connects me to
> NumPy as Continuum continues to offer commercial support contracts for
> NumPy (and SciPy and other open source projects). Continuum will also
> continue to maintain its GitHub NumPy project which will contain pull
> requests from our company that we are working to get into the mainline
> branch. Continuum will also continue to provide resources for
> release-management of NumPy (we have been funding Ondrej in this role for
> the past 6 months --- though I would like to see this happen through
> NumFOCUS in the future even if Continuum provides much of the money). We
> also offer optimized versions of NumPy in our commercial Anaconda
> distribution (Anaconda CE is free and open source).
> Also, I will still be available for questions and help (I'm not
> disappearing --- just making it clear that I'm stepping back into an
> occasional NumPy developer role). It has been extremely gratifying to see
> the number of pull-requests, GitHub-conversations, and code contributions
> increase this year. Even though the 1.7 release has taken a long time to
> stabilize, there have been a lot of people participating in the discussion
> and in helping to track down the problems, figure out what to do, and fix
> them. It even makes it possible for people to think about 1.7 as a
> long-term release.
> I will continue to hope that the spirit of openness, tolerance, respect,
> and gratitude continues to permeate this mailing list, and that we continue
> to seek to resolve any differences with trust and mutual respect. I know
> I have offended people in the past with quick remarks and actions made
> sometimes in haste without fully realizing how they might be taken. But,
> I also know that like many of you I have always done the very best I could
> for moving Python for scientific computing forward in the best way I know
> how.
> Thank you for the great memories. If you will forgive a little
> sentiment: My daughter who is in college now was 3 years old when I began
> working with this community and went down a road that would lead to my
> involvement with SciPy and NumPy. I have marked the building of my family
> and the passage of time with where the Python for Scientific Computing
> Community was at. Like many of you, I have given a great deal of
> attention and time to building this community. That sacrifice and time
> has led me to love what we have created. I know that I leave this
> segment of the community with the tools in better hands than mine. I am
> hopeful that NumPy will continue to be a useful array library for the
> Python community for many years to come even as we all continue to build
> new tools for the future.
> Very best regards,