[Numpy-discussion] New project : Spyke python-to-C compiler
Sun Apr 6 19:48:50 CDT 2008
Note this message has been posted to numpy-discussion and python-dev.
Sorry for the multiple posting, but I thought both Python devs and
NumPy users would be interested. If you believe your list should not
receive this email, let me know. I also wanted to introduce myself,
since I may ask questions about Python and NumPy internals from
time to time :)
I am a student at Univ of Alberta doing my masters in computing science.
I am writing a Python-to-C compiler as one part of my thesis.
The compiler, named Spyke, will be made available in a couple of weeks.
It is geared towards scientific applications and will therefore focus
mostly on the needs of scientific app developers.
What is Spyke?
In many performance-critical projects, it is often necessary to
rewrite parts of the application in C. However, writing C wrappers can
be time consuming. Spyke offers an alternative approach. You add
annotations to your Python code as strings. These strings are
discarded by the Python interpreter, but the Spyke compiler interprets
them as type declarations when converting the code to C.
"int -> int"
def f(x): return 2*x
In this case the Spyke compiler will consider the string "int -> int"
as a declaration that the function accepts an int as its parameter and
returns an int. Spyke will then generate a C function and a wrapper.
This idea is directly copied from the PLW (Python Language Wrapper)
project. Once Python3k arrives, many of these declarations will be
moved to function annotations and class decorators.
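For reference, this is roughly what the same declaration looks like with
Python 3 function annotations. This is plain Python 3 syntax, shown only
as an illustration; whether and how Spyke will read annotations is not
settled here.

```python
# Python 3 function annotations; plain Python, not Spyke-specific syntax.
def f(x: int) -> int:
    return 2 * x

# Annotations are stored on the function object and do not change
# how the interpreter runs the function.
signature = f.__annotations__
result = f(21)
```

The interpreter ignores the annotations when calling f, so the code still
runs unchanged under standard Python 3.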
This way you can do all your development and debugging interactively
using the standard Python interpreter. When you need to compile to C,
you just add type annotations to places that you want to convert and
invoke spyke on the annotated module. This is different from Pyrex
because Pyrex does not accept Python code. With Spyke, your code is
100% pure Python.
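To make the string-declaration idea concrete, here is a minimal sketch of
how a tool could recover such strings from unmodified Python source using
the standard ast module. This is not Spyke's implementation; the helper
name find_string_declarations is made up for this example, and it only
looks at top-level functions.

```python
import ast

SOURCE = '''
"int -> int"
def f(x): return 2*x

def g(y): return y
'''

def find_string_declarations(source):
    """Map function names to the string statement directly above them."""
    body = ast.parse(source).body
    declarations = {}
    # Walk over consecutive pairs of top-level statements.
    for previous, node in zip(body, body[1:]):
        is_string_stmt = (
            isinstance(previous, ast.Expr)
            and isinstance(previous.value, ast.Constant)
            and isinstance(previous.value.value, str)
        )
        if is_string_stmt and isinstance(node, ast.FunctionDef):
            declarations[node.name] = previous.value.value
    return declarations

declarations = find_string_declarations(SOURCE)
```

Here f is picked up with its declaration string while g, which has no
string above it, is left alone, which mirrors the idea that unannotated
code simply stays as Python.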
Spyke has basic support for functions and classes. Spyke can do very
basic type inference for local variables in function bodies. Spyke has
partial support for homogeneous lists and dictionaries and fixed-length
tuples.
One big advantage of Spyke is that it understands at least part of
NumPy. NumPy arrays are treated as fundamental types and Spyke knows
what C code to generate for slicing/indexing of NumPy arrays etc. This
should help a lot in scientific applications. Note that Spyke can
handle only a subset of Python. Exceptions, iterators, generators and
runtime code generation of any kind are not handled. Nested functions
will be added soon. I will definitely add some of these missing
features based on what is actually required for real-world Python
codes. Currently, if Spyke does not understand a function, it just
leaves it as Python code. Classes can be handled, but special methods
are not currently supported. The support for classes is a little
brittle because I am trying to resolve some issues between old-style
and new-style classes.
Where is Spyke?
Spyke will be available as a binary-only release in a couple of weeks.
Spyke is written in Python and Java and should be platform independent.
I intend to make it open source after a few months. Right now it's
undergoing very rapid development and has negligible amounts of
documentation, so the source code right now is pretty useless to anyone.
I need help:
However, I need a bit of help. I am having a couple of problems:
a) I am finding it hard to get pure Python+NumPy testing codes. I need
more codes to test the compiler. Developing a compiler without a
test suite is kind of useless. If you have some pure Python codes
which need better performance, please contact me. I guarantee that
your codes will not be released to the public without your permission,
but they might be referenced in academic publications. I can also make
the compiler available to you, hopefully after the 10th of April. It's
kind of unstable currently. I will also need your help in annotating
the provided testing codes, since I probably won't know what your
application is doing.
b) Libraries which interface with C/C++ : Many codes in SciPy, for
instance, are mixed-language codes. Part of the code is written in
C/C++. Spyke only knows how to annotate Python codes. For C/C++
libraries wrapped into Python modules, Spyke will therefore need to
know at least 2 things :
i) The mapping of a C function name/struct etc to Python
ii) The type information of the said C function.
There are many, many ways that people interact with C code. People
either write wrappers manually, or use autogenerated wrappers from
SWIG, SIP, Boost.Python etc., or use Pyrex or Cython, while some people
use ctypes. I don't have the time or resources to support this
multitude of methods. I considered trying to parse the C code
implementing wrappers, but it's "non-trivial", to put it mildly. Parsing
only SWIG-generated code is another possibility, but it's still hard.
Another approach that I am seriously considering is to support a
subset of ctypes (with additional restrictions) instead. But my
question is : Is ctypes good enough for most of you? Ctypes cannot
interface with C++ code, but it's pure Python. However, I have not seen
too many instances of people using ctypes.
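As a small illustration of why ctypes is attractive here, the sketch
below shows that a ctypes binding already states both pieces of
information listed above: the name of the C function as seen from Python,
and its type signature. It assumes a POSIX system where the C library can
be loaded, and uses the standard C function abs purely as an example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library may return None, in which
# case CDLL(None) exposes the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# i) Mapping: the Python attribute name is the C function name.
c_abs = libc.abs

# ii) Type information: declared explicitly, in pure Python.
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

result = c_abs(-5)
```

Because argtypes and restype are ordinary Python assignments, a compiler
could in principle read them from the module source without parsing any
C code.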
c) Strings as type declarations : Do you think I should use decorators
instead at least for function type declarations?
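For comparison, a decorator-based declaration could look something like
the sketch below. The decorator name typed and the attribute
type_declaration are hypothetical, invented for this example; they are
not part of Spyke.

```python
def typed(signature):
    """Hypothetical decorator: records a type declaration on a function."""
    def attach(func):
        # Attribute name is made up for this sketch.
        func.type_declaration = signature
        return func
    return attach

@typed("int -> int")
def f(x):
    return 2 * x

declared = f.type_declaration
result = f(3)
```

One trade-off: the decorator must be defined or imported for the module
to run under the standard interpreter, whereas a bare string needs
nothing extra.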
thanks for patiently reading this,
comments and inquiries sought.