[Numpy-discussion] Notes from meeting with Guido regarding inclusion of array package in Python core
konrad.hinsen at laposte.net
Fri Mar 11 02:34:14 CST 2005
On Mar 10, 2005, at 16:28, Perry Greenfield wrote:
> On March 7th Travis Oliphant and Perry Greenfield met Guido and Paul
> Dubois to discuss some issues regarding the inclusion of an array
> package within core Python.
A good initiative - and thanks for the report!
> So what about supporting arrays as an interchange format? There are a
> number of possibilities to consider, none of which require inclusion
> of arrays into the core. It is possible for 3rd party extensions to
> optionally support arrays as an interchange format through one of the
> following mechanisms:
True, but any of these options requires a much bigger effort than
relying on a module in the standard library. Pointing out these methods
is not exactly a way of encouraging people to use arrays as an
interchange format, it's more a way of telling them that if they need a
compact interchange format badly, there is a solution.
> a) So long as the extension package has access to the necessary array
> include files, it can build the extension to use the arrays as a
> format without actually having the array package installed. The
> include files alone could be included into the core
True, but this implies nearly the same restrictions to evolution of the
array code as having it in the core. The Numeric headers have changed
frequently in the past.
> seem quite as receptive instead suggesting the next option) or could
> be packaged with extension (we would prefer the former to reduce the
> possibilities of many copies of include files). The extension could
> then be successfully compiled without
Having the header files in all client extensions is a sure recipe to
block Numeric development. Any header change would imply non-acceptance
by the end-user community.
If C were a language with implementation-independent interface
descriptions, such approaches would be reasonable, but C is... well, C.
> b) One could modify the extension build process to see if the package
> is installed and the include files are available, if so, it is built
> with the support, otherwise not.
This is already possible today, and probably used by some extension
modules. I use a similar test to build the netCDF interface selectively
(if netCDF is available), and I can tell from experience that this
causes quite some confusion for some users who install ScientificPython
before netCDF (although the instructions point this out - but nobody
seems to read instructions). But the main problem with this approach is
that it doesn't work for pre-built binary distributions, i.e. in
particular the Windows world.
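For illustration, the build-time test of option (b) might look roughly like this in a present-day Python build script. It is only a sketch: the helper name optional_build_flags and the macro HAVE_ARRAY_SUPPORT are made up for the example, and importlib.util.find_spec only tells whether a module is importable, not whether its header files are present.

```python
import importlib.util


def optional_build_flags(module_name):
    """Return extra C macro definitions depending on whether an optional
    package is importable at build time.

    Hypothetical helper: the macro name HAVE_ARRAY_SUPPORT is an
    assumption made for this sketch, not part of any real package.
    """
    if importlib.util.find_spec(module_name) is not None:
        # Package found: compile the extension with array support.
        return [("HAVE_ARRAY_SUPPORT", "1")]
    # Package missing: build without the optional feature.
    return []


# The result would typically be passed as define_macros to an Extension.
print(optional_build_flags("json"))               # stdlib module, always present
print(optional_build_flags("no_such_array_pkg"))  # assumed not to be installed
```

The decision is frozen when the extension is built, which is why the approach breaks down for pre-built binaries: the builder's machine decides, not the user's.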
> c) One could provide the support at the Python level by instead
> relying on the use of buffer objects by the extension at the C level,
> thus avoiding any dependence on the array C api. So long as the
> extension has the ability to return buffer objects
That's certainly the cleanest solution, but it also requires a serious
effort from the extension module writer: one more API to learn and use,
and conversion between buffers and arrays in all modules that
definitely need array functions.
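A rough Python-level picture of option (c), using only the standard library. The function names produce_samples and as_doubles are invented for this sketch, and memoryview is used here as the modern counterpart of the buffer objects discussed above; the real interface of any given extension may look quite different.

```python
import array


def produce_samples():
    """Stand-in for an extension function that returns raw data as a
    buffer (here: bytes holding three C doubles in native byte order)."""
    return array.array("d", [1.0, 2.5, 4.0]).tobytes()


def as_doubles(buf):
    """Reinterpret a bytes-like object as a sequence of C doubles
    without copying, using the buffer protocol."""
    return memoryview(buf).cast("d")


view = as_doubles(produce_samples())
print(list(view))  # [1.0, 2.5, 4.0]
```

Note that the consumer has to know the element type (and, for real arrays, the shape) by some other means. That bookkeeping is the conversion effort each module would have to repeat.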
> We talked at some length about whether it was possible to change
> Python's numeric behavior for scalars, namely support for configurable
> handling of numeric exceptions in the way numarray does it (and
> Numeric3 as well). In short, not much was resolved. Guido didn't much
> like the stack approach to the exception handling mode. His argument
> (a reasonable one) was that even if the stack allowed pushing
I agree with Guido there. It looks like a hack.
> the decimal's use of context to see if it could be used as a model.
> Overall he seemed to think that setting mode on a module basis was a
> better approach. Travis and I wondered about how that could be
> implemented (it seems to imply that the exception handling needs to
> know what module or namespace is being executed in order to determine
> the mode.
That doesn't look simple. How about making error handling a
characteristic of the type itself? That would double the number of
float element types, but that doesn't seem a big deal to me. Handling
the conversions and coercions is probably a bigger headache.
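To make the idea concrete, here is a minimal sketch in plain Python, assuming the built-in float plays the role of the "raising" type and a hypothetical subclass QuietFloat plays the role of the "non-raising" one. It covers division only and is in no way a proposal for an actual implementation.

```python
import math


class QuietFloat(float):
    """Hypothetical float variant whose division never raises:
    x/0 yields inf, -inf or nan, as in IEEE 754 arithmetic."""

    def __truediv__(self, other):
        try:
            return QuietFloat(float(self) / float(other))
        except ZeroDivisionError:
            if self == 0 or math.isnan(self):
                return QuietFloat("nan")
            # Sign of the result follows the signs of both operands.
            sign = (math.copysign(1.0, float(self))
                    * math.copysign(1.0, float(other)))
            return QuietFloat(sign * math.inf)


print(QuietFloat(1.0) / 0.0)   # inf
print(QuietFloat(-2.0) / 0.0)  # -inf

# Mixed operands: the plain float on the left decides, so this raises.
try:
    1.0 / QuietFloat(0.0)
except ZeroDivisionError:
    print("plain float / QuietFloat still raises")
```

The last case shows the coercion headache in miniature: the result of a mixed operation depends on which operand comes first unless every reflected operation is defined as well.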
> So some more thought is needed regarding this. The difficulty of
> proposing such changes and getting them accepted is likely to be
> considerable. But Travis had a brilliant idea (some may see this as
> evil but I think it has great merit). Nothing prevents a C extension
> from hijacking the existing Python scalar objects behaviors.
True, and I like that idea a lot for testing and demonstrating
concepts. Whether it's a good idea for production code is another
question, and one to be discussed with Guido and the Python team in my
opinion.
> Python at all (as such), no rank-0 arrays. This will be studied
> further. One possible issue is that adding the necessary machinery to
> make numeric scalar processing consistent with that of the array
> package may introduce significant performance penalties (what is
> negligible overhead for arrays may not be for scalars).
Adding a couple of methods should not cause any overhead at all. Where
do you see the origin of the overhead?
> One last comment is that it is unlikely that any choice in this area
> prevents the need for added helper functions to the array package to
> assist in writing code that works well with scalars and arrays. There
> are likely a number of such issues. A common
That remains to be seen. I must admit that I am personally a bit
surprised by the importance this problem seems to have for many. I have
a single spot on a single module that checks for scalar vs. array,
which is negligible considering the amount of numerical code that I
> approach is to wrap all unknown objects with "asarray". This works
> reasonably well but doesn't handle the following case: If you wish to
> write a function that will accept arrays or scalars, in principle it
> would be nice to return scalars if all that was supplied were scalars.
> So functions to help determine what the output type should
That happens automatically if you use asarray() only when you
definitely need an array. I would expect this to be the case for list
arguments rather than for scalar arguments.
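A small sketch of what I mean, written without any array package so that it runs anywhere. The class Vec is a deliberately tiny stand-in for an array type and kinetic_energy is an invented example function; with a real array package the explicit conversion would be a call to asarray().

```python
class Vec:
    """Tiny stand-in for an array type (hypothetical, for illustration):
    supports elementwise multiplication with a Vec or a scalar."""

    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, other):
        if isinstance(other, Vec):
            return Vec(a * b for a, b in zip(self.items, other.items))
        return Vec(a * other for a in self.items)

    # scalar * Vec falls back to the same elementwise rule
    __rmul__ = __mul__


def kinetic_energy(mass, velocity):
    # Convert only list/tuple arguments; scalars and Vec objects pass
    # through untouched, so scalar input gives scalar output.
    if isinstance(velocity, (list, tuple)):
        velocity = Vec(velocity)
    return 0.5 * mass * velocity * velocity


print(kinetic_energy(2.0, 3.0))                # 9.0
print(kinetic_energy(2.0, [1.0, 2.0]).items)   # [1.0, 4.0]
```

Since the arithmetic itself works for both kinds of argument, no scalar-versus-array test is needed anywhere except at the point where a list has to be converted.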
Laboratoire Léon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: khinsen at cea.fr