[Numpy-discussion] Numpy/Cython Google Summer of Code project idea

Fernando Perez fperez.net@gmail....
Fri Mar 7 03:36:40 CST 2008


On Fri, Mar 7, 2008 at 1:17 AM, Konrad Hinsen <konrad.hinsen@laposte.net> wrote:
> On 07.03.2008, at 09:59, Fernando Perez wrote:
>
>  > I doubt it's much better, and that's part of the point of the project:
>  > to identify the problems and fix them once and for all.  Getting
>  > anything fixed in pyrex was hard due to a very opaque development
>  > process, but Cython is part of the Sage umbrella and thus enjoys a
>  > very open and active development community.  Furthermore, they are
>  > explicitly interested in improving the Cython numpy support, and are
>  > willing to help along if this project goes forward.
>
>  This is very good news in my opinion. Pyrex and Cython are already
>  very useful tools for scientific computing. They lower the barrier to
>  writing extension modules significantly (compared to writing directly
>  in C), and they permit a continuous transition from a working Python
>  prototype to an efficient extension module. I have been writing all
>  my recent extension modules using Pyrex, and I definitely won't go
>  back to C. If Cython gets explicit array support, it would become an
>  even more useful tool for the NumPy community.

Thanks for your feedback and support of the idea, Konrad.

I just realized that I forgot to include this message that W. Stein
(sage lead) sent me, which I think presents many of these points very
nicely and may be useful in this discussion.

cheers

f



---------- Forwarded message ----------
From: Dag Sverre Seljebotn <dagss@student.matnat.uio.no>
Date: Tue, Mar 4, 2008 at 2:54 PM
Subject: [Cython] Thoughts on numerical computing/NumPy support
To: cython-dev@codespeak.net


Since Robert mentioned NumPy in relation to adding operator support, I
 thought I would share some more of my thoughts about NumPy. I'm very new
 to Cython, so take it for what it is worth. However, what I've seen
 so far looks so promising to me that I might want to spend some time in
 a few months working on implementing some of this, which perhaps may
 make my thoughts more interesting :-)

 Currently, Cython is mostly geared towards wrapping C code, but it is
 also an excellent foundation for being a numerical tool - but the rough
 edges are still prohibitive. A few relatively small steps (in terms of
 man-hours needed) would improve the situation a lot I think - not
 perfect, but perhaps in a few years we can have something that will
 finally kill FORTRAN :-)

 Three suggestions follow briefly here; if anyone's interested and they
 have not already been discussed and decided, I might flesh them out in
 "PEP-style" in the coming month?

 Note that a) is what is important for me; b) and c) are just things I
 throw in...

 a) numpy.ndarray syntax candy. Really, what one should implement is
 syntax support for PEP-3118:

 http://www.python.org/dev/peps/pep-3118/

 Because this protocol will be shared between NumPy, PIL etc. in Python 3
 it could make sense to simply have "native"/hard-coded support for this
 aspect without necessarily making it a generic operator feature, and one
 can then use the same approach as will be needed for buffers in Python 3
 for NumPy in Python 2?

 Example (where "array" is considered a new, Cython-native type that will
 have automatic conversion from any NumPy arrays and Python 3 buffers):

 def myfunc(array<2, unsigned char> arr):
   arr[4, 5] = 10

 might be translated to the equivalent of the currently legal:

 def myfunc(numpy.ndarray arr):
   if arr.nd != 2 or arr.dtype != numpy.dtype(numpy.uint8):
     raise ValueError("Must pass 2-dimensional uint8 array.")
   cdef unsigned char* arr_buf = <unsigned char*>arr.data
   arr_buf[4 * arr.strides[0] + 5 * arr.strides[1]] = 10

 (Probably caching the strides in local variables etc.). That should do
 as a first implementation -- it is always possible to be more
 sophisticated, but this little will allow NumPyers to simply dive in.
 Specifically, the number of dimensions must be declared first and only
 direct access in that many dimensions is allowed. Slices etc. should be
 less important (they can be done on the Python object instead).
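
 The offset arithmetic in the translated example can be tried in plain
 Python. The following is only a sketch, not the proposed Cython feature:
 it uses the standard-library memoryview (which exposes PEP-3118 buffers)
 in place of a NumPy array, and store_uint8 is a hypothetical helper name
 introduced for illustration.

```python
def store_uint8(data, strides, index, value):
    """Store one byte at a 2-D index by computing its flat byte offset."""
    i, j = index
    # Same arithmetic as the translated example: strides are in bytes.
    offset = i * strides[0] + j * strides[1]
    data[offset] = value


rows, cols = 6, 8
data = bytearray(rows * cols)  # flat, zero-filled storage
# A 2-D unsigned-char view over the same memory; it reports the strides.
view = memoryview(data).cast("B", shape=[rows, cols])

store_uint8(data, view.strides, (4, 5), 10)
print(view.strides, view[4, 5])
```

 Because the element type is one byte wide, the byte offset and the
 element index coincide here; for wider types the offset would have to be
 applied to a byte pointer before casting.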

 Moving on from here, one should probably instead define bufferinfo from
 PEP-3118 and make it say

 def myfunc(bufferinfo arr):
   if arr.ndim != 2 or arr.format != "B" or arr.readonly:
     raise ValueError("Must pass writeable 2-dimensional buffer with format 'B'.")
 ...

 with automatic conversion from NumPy arrays to bufferinfo.
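
 The same checks can be exercised today through Python's memoryview,
 which exposes the PEP-3118 fields ndim, format and readonly. This is a
 sketch under that assumption; require_writable_2d_bytes is a
 hypothetical name and is not part of Cython or NumPy.

```python
def require_writable_2d_bytes(obj):
    """Return a memoryview of obj, or raise if it is not a writable 2-D 'B' buffer."""
    view = memoryview(obj)
    if view.ndim != 2 or view.format != "B" or view.readonly:
        raise ValueError("Must pass writeable 2-dimensional buffer with format 'B'.")
    return view


# A writable 3x4 unsigned-char buffer passes the check.
good = memoryview(bytearray(12)).cast("B", shape=[3, 4])
checked = require_writable_2d_bytes(good)
checked[1, 2] = 7

# A view over immutable bytes is read-only and is rejected.
read_only = memoryview(bytes(12)).cast("B", shape=[3, 4])
try:
    require_writable_2d_bytes(read_only)
    rejected = False
except ValueError:
    rejected = True
```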


 b) Allow numpy types? Basically, make it possible to say "cdef uint8
 myvar", at least for variables inside functions that do not interface
 with C code, so that for numerical use one doesn't need to learn C. This
 could be offered in addition to the existing C types, so it should not
 break existing code, though I can understand resentment against the idea
 as well.

 c) Probably controversial: More Pythonic syntax. A syntax for annotating
 function arguments has been decided upon (at least in Python 3), so to
 align with that one could allow for stuff like

 @Compile
 def myfunc(a: uint8, b: array(2, uint8), c: int = 10):
   d: ptr(int) = &a
   print a, b, c, d

 Which is "almost" Python - only the definition of d is different, but
 consistency argues for a change there as well. This can also be offered
 in addition to the existing syntax, so it should not break anything
 (allowing, say, only one type of syntax per function).
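
 For reference, Python 3 function annotations are stored on the function
 object, which is what a compiler decorator could read. The sketch below
 is plain Python, not Cython: compile_stub is a hypothetical stand-in for
 the decorator, and the type names are written as strings because uint8
 and array are not real Python names here.

```python
def compile_stub(func):
    """Hypothetical stand-in: record the declared types instead of compiling."""
    func.declared_types = dict(func.__annotations__)
    return func


@compile_stub
def myfunc(a: "uint8", b: "array(2, uint8)", c: int = 10):
    return a, b, c


print(myfunc.declared_types)
```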

 But a) is what is interesting here...

 --
 Dag Sverre

