[Numpy-discussion] Numpy-discussion Digest, Vol 19, Issue 24
Mon Apr 7 01:19:03 CDT 2008
> What will be the licensing of this project? Do you know yet?
I am thinking GPL for the compiler and LGPL for any runtime components,
which should be similar to GCC. Which version of the license (version 2
or version 3) is undecided. I will also check with the uni to see if they
have any problems (they shouldn't). I also need to check with the uni about
hosting; I believe I will need to host on uni servers.
> I have a couple of comments because I've been thinking along these lines.
>> What is Spyke?
>> In many performance critical projects, it is often necessary to
>> rewrite parts of the application in C. However writing C wrappers can
>> be time consuming. Spyke offers an alternative approach. You add
>> annotations to your Python code as strings. These strings are
>> discarded by the Python interpreter, but the Spyke compiler
>> interprets them as types when converting the code to C.
>> Example :
>> "int -> int"
>> def f(x): return 2*x
>> In this case the Spyke compiler will consider the string "int -> int"
>> as a declaration that the function accepts int as parameter and
>> returns int. Spyke will then generate a C function and a wrapper
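To illustrate how a tool could pick up such string annotations, here is a minimal sketch using Python's standard `ast` module. It is not Spyke's actual implementation; the helper name `find_annotations` and the rule "a bare string directly before a `def` is its type" are assumptions made for this example.

```python
import ast

SOURCE = '''
"int -> int"
def f(x): return 2*x

def g(y): return y
'''

def find_annotations(source):
    """Map function name -> the bare string literal directly preceding it, if any."""
    body = ast.parse(source).body
    found = {}
    # Look at each pair of adjacent top-level statements.
    for prev, node in zip(body, body[1:]):
        is_bare_string = (
            isinstance(prev, ast.Expr)
            and isinstance(prev.value, ast.Constant)
            and isinstance(prev.value.value, str)
        )
        if is_bare_string and isinstance(node, ast.FunctionDef):
            found[node.name] = prev.value.value
    return found

annotations = find_annotations(SOURCE)
print(annotations)
```

The interpreter evaluates the bare string and throws the result away, so the annotated module still runs unchanged as ordinary Python.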
> What about the use of decorators in this case?
I can certainly use decorators. Will implement this change soon.
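A decorator-based declaration might look roughly like the sketch below. The decorator name `typed` and the attribute `__spyke_signature__` are hypothetical and only illustrate the idea; they are not an existing Spyke API.

```python
def typed(signature):
    """Hypothetical decorator: attach a "args -> result" signature string to a function."""
    arg_part, _, ret_part = signature.partition("->")
    arg_types = [t.strip() for t in arg_part.split(",") if t.strip()]
    return_type = ret_part.strip()

    def decorate(func):
        # Only record the declared types; the function itself is left untouched,
        # so it keeps working under the plain interpreter.
        func.__spyke_signature__ = (arg_types, return_type)
        return func

    return decorate

@typed("int -> int")
def f(x):
    return 2 * x

print(f(3), f.__spyke_signature__)
```

Unlike a bare string, the decorator leaves the declared types attached to the function object, so a compiler or any other tool can read them back later.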
> Also, it would be great to be able to create ufuncs (and general numpy
> funcs) using this approach. A decorator would work well here as well.
>> Where is Spyke?
>> Spyke will be available as a binary only release in a couple of weeks.
>> I intend to make it open source after a few months.
> I'd like to encourage you to make it available as open source as early
> as possible. I think you are likely to get help in ways you didn't
> expect. People are used to reading code, so even an alpha project can
> get help early. In fact, given that you are looking for help, I
> think this may be the best way to get it.
Ok .. I will release the source along with the binary. I need to sort
some stuff out, so it might take a couple of weeks. Note that much of the
compiler is (for better or worse) written in Java. The codebase isn't
very OOP (it is full of static methods and looks more like garbage-collected
C) but it is not too complex either. I use CPython's "compiler" module to
dump the AST into an intermediate file, which is then parsed by the
compiler in Java. The compiler uses the AST representation throughout.
The compiler also depends upon the ANTLR Java runtime.
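The "dump the AST into an intermediate file" step can be sketched as follows. This uses the modern standard `ast` module rather than the Python 2 "compiler" module mentioned above, and the textual `ast.dump` format is an assumption for illustration; the actual intermediate format Spyke uses is not described here.

```python
import ast
import os
import tempfile

source = "def f(x):\n    return 2 * x\n"
tree = ast.parse(source)

# Write a textual dump of the tree to an intermediate file that a
# separate program (for example, one written in Java) could then parse.
with tempfile.NamedTemporaryFile("w", suffix=".ast", delete=False) as handle:
    handle.write(ast.dump(tree))
    dump_path = handle.name

# Read it back to show what the downstream consumer would see.
with open(dump_path) as handle:
    dumped = handle.read()
os.remove(dump_path)

print(dumped[:60])
```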
For hosting, I will probably get some space at the univ servers. I
will try to get trac installed.
I will release when all of the following "work" (known limitations are
listed alongside):
a) Basic support for functions and classes.
b) Keyword parameters not supported.
c) Special methods not supported except __init__.
d) __init__ is treated as constructor. Custom __new__ not supported.
e) Nested functions may be broken.
f) Functions will be divided into 2 types : static and dynamic. Static
functions should not be redefined at runtime, while dynamic functions
can be redefined at runtime but will be more costly to call since I
need to look up the binding on each call. Also, even though a dynamic
function can be redefined, its type signature should not change. If a
static function calls another static function, then the compiler will
try to insert a call to the C function instead of the wrapped function,
thus bypassing the interpreter if possible.
g) Compiled classes should not redefine methods at runtime. Will have
an option to annotate classes as "final", meaning the user shouldn't
subclass them. For such classes, it's easier to generate efficient code
for attribute access.
Also, compiled classes shouldn't dynamically add/delete attributes.
h) Users shouldn't subclass numpy array.
i) For method calls on objects, the generated code will mostly just
end up making a call to the interpreter, so performance in this case
will not be particularly good currently. For ints, floats etc. the
equivalent C code will be generated, so for these types the code should
be fast enough.
j) For indexing of numpy arrays, unsafe code is generated. I directly
access the array without any index checking.
k) Loops : This is the weakest point currently. I only allow for-loops
over range() or xrange() allowing easy conversion to C. Cannot loop
over elements of other lists or numpy arrays etc.
l) Exec, eval, metaclasses, dynamic class creation, dynamic
adding/deleting attributes etc not allowed inside typed code.
m) A module cannot currently mix typed and untyped code. A module has
to be completely typed/annotated or it should be left alone and not
compiled. Also a typed module cannot have arbitrary executable code
and should only consist of single statement variable declarations,
function and class definitions. Of course rest of your application can
be left untyped. In the future I will try to allow mixing typed and
untyped code in a module.
n) Importing of other typed modules also mostly supported.
o) Builtin functions : range and len mostly work. But cannot guarantee
p) Lists, tuples and dictionaries can be used but need to be
homogeneous. Not all methods supported yet. Moreover the code
generated for these types mostly just generates function calls to the
Python interpreter, so this doesn't speed things up (yet). Not very sure
how to handle subclasses of these and other builtin types.
q) For function parameters of user-defined-class types, you can
declare the parameter as "final". For example, the type can be declared
as "final SomeClass", meaning that you will only pass SomeClass and not
a subclass. This allows the compiler (in the future) to generate better
code for attribute access.
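The difference between static and dynamic functions in item (f) can be mimicked in plain Python. The sketch below is only an analogy: the `bindings` dictionary stands in for the interpreter's namespace, and all names are hypothetical rather than anything Spyke generates.

```python
# Stand-in for the interpreter's namespace (hypothetical, for illustration only).
bindings = {"scale": lambda x: 2 * x}

# "Static" call site: the target is resolved once, ahead of time.
static_target = bindings["scale"]

def call_static(x):
    return static_target(x)

def call_dynamic(x):
    # "Dynamic" call site: the binding is looked up on every call.
    return bindings["scale"](x)

# Redefine the function at runtime, keeping the same int -> int signature.
bindings["scale"] = lambda x: 3 * x

static_result = call_static(5)    # still uses the original definition
dynamic_result = call_dynamic(5)  # sees the redefinition

print(static_result, dynamic_result)
```

The static call site skips the lookup but silently keeps the old definition, which is why static functions should not be redefined at runtime.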
Expected release date : Soon. Hopefully by 15th to 20th April.
> If you need help getting set up in terms of hosting it somewhere, I can
> help you do that.
>> Spyke is written in Python and Java and should be platform independent.
>> I do intend to make the source open in a few months. Right now it's
>> undergoing very rapid development and has negligible amounts of
>> documentation so the source code right now is pretty useless to anyone
>> else anyway.
>> c) Strings as type declarations : Do you think I should use decorators
>> instead at least for function type declarations?
> I think you should use decorators. That way you can work towards
> having the compiler "embedded" in the decorator and happen seamlessly
> without invoking a separate "program" (it just happens when the module is
> loaded -- a la weave).
Well that can be done provided certain restrictions are met. One major
problem is that it will make user applications dependent upon presence
of a JVM since the compiler is in Java.
Secondly seeing as much code as possible at compile time helps the compiler.
For example, if you have a function G called inside function F, then
the compiler needs to know the type of G which may not have been
Also I am trying to work my way towards a whole program compiler since
some of the optimizations that I want to research for my thesis are
dependent on seeing the whole program. Those haven't been implemented
yet but will be in the future.
Basically, if we call the compiler on one function at a time, the
applicability of the compiler is reduced somewhat. So I will try to
provide an option to invoke it at runtime too, but it will have fewer
features and in most cases less performance.
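One way the per-function limitation can show up is sketched below with the standard `ast` module. It assumes, purely for illustration, that G is defined after F in the module; the helper `called_names` and the string-annotation layout are hypothetical.

```python
import ast

SOURCE = '''
"int -> int"
def F(x): return G(x) + 1

"int -> int"
def G(x): return 2 * x
'''

def called_names(func_node):
    """Names of plain functions called anywhere inside func_node."""
    return {
        n.func.id
        for n in ast.walk(func_node)
        if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
    }

functions = [n for n in ast.parse(SOURCE).body if isinstance(n, ast.FunctionDef)]

# One function at a time: only functions processed earlier are known.
seen = set()
unknown_per_function = {}
for fn in functions:
    unknown_per_function[fn.name] = called_names(fn) - seen
    seen.add(fn.name)

# Whole program: every function in the module is known up front.
all_names = {fn.name for fn in functions}
unknown_whole_program = {fn.name: called_names(fn) - all_names for fn in functions}

print(unknown_per_function)
print(unknown_whole_program)
```

When F is processed on its own, the call to G cannot be resolved yet, whereas a pass over the whole module sees both signatures.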
Also, any thoughts on interfacing with existing C code?