[Numpy-discussion] Numpy and PEP 343

Tim Hochberg tim.hochberg at cox.net
Fri Mar 3 14:36:04 CST 2006

David M. Cooke wrote:

>Tim Hochberg <tim.hochberg at cox.net> writes:
>>That makes sense. One thought I had with respect to the various numpy
>>functions (sin, cos, pow, etc) was to just have the bytecodes:
>>call_unary_function, function_id, store_in, source
>>call_binary_function, function_id, store_in, source1, source2
>>call_trinary_function, function_id, store_in, source1, source2, source3
>>Then just store pointers to the functions in relevant tables. In its
>>most straightforward form, you'd need 6 character chunks of bytecode
>>instead of four.  However, if that turns out to slow everything else
>>down I think it could be packed down to 4 again. The function_ids
>>could probably be packed into the opcode (as long as we stay below 200
>>or so functions, which is probably safe). The other way to pack things
>>down is to require that one of the sources for trinary functions is
>>always a certain register (say register-0). That requires a bit more
>>cleverness at the compiler level, but is probably feasible.
>That's along the lines I'm thinking of. It seems to me that if
>evaluating the function requires a function call (and not an inlined
>machine instruction like the basic ops), then we may as well dispatch
>like this (plus, it's easier :). This could also allow for user
>extensions. Binary and trinary (how many of those do we have??) could
>maybe be handled by storing the extra arguments in a separate array.
We may want to reconsider this, at least partially. I tried implementing 
a few 1-argument functions two ways: first as a table lookup, and second 
using a dedicated opcode. The first gave me a speedup of almost 60%, but 
the second gave me a speedup of 100%. The difference surprised me, but I 
suspect it's due to the fact that the x86 supports some functions 
directly, so the function call gets optimized away for sin and cos just 
as it does for +-*/. That implies that some functions should get their 
own opcodes, while others are not worthy. This little quote from 
Wikipedia lists the functions we would want to give their own opcodes to:

    x86 (since the 80486DX processor) assembly language includes a stack
    based floating point unit which can perform addition, subtraction,
    negation, multiplication, division, remainder, square roots, integer
    truncation, fraction truncation, and scale by power of two. The
    operations also include conversion instructions which can load or
    store a value from memory in any of the following formats: Binary
    coded decimal, 32-bit integer, 64-bit integer, 32-bit floating
    point, 64-bit floating point or 80-bit floating point (upon loading,
    the value is converted to the currently floating point mode). The
    x86 also includes a number of transcendental functions including
    sine, cosine, tangent, arctangent, exponentiation with the base 2
    and logarithms to bases 2, 10, or e

So, that's my new proposal: some functions get their own opcodes (sin, 
cos, ln, log10, etc), while others get shunted to table lookup (not sure 
what's in that list yet, but I'm sure there's lots).
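
To make the split concrete, here is a rough Python sketch of the mixed
dispatch. It is only an illustration of the encoding, not the actual
interpreter: the opcode names, the contents of UNARY_TABLE, and the
choice to put the function_id in the otherwise unused fourth byte of a
unary call are all assumptions of mine. Being Python, it also says
nothing about the speed difference, which only shows up at the C level.

```python
import math

# Hypothetical opcode numbers; these names are made up for illustration.
OP_ADD, OP_MUL, OP_SIN, OP_COS, OP_CALL_UNARY = range(5)

# Functions without a dedicated opcode live in a table; the function_id
# is simply the index into this list (the contents are an assumption).
UNARY_TABLE = [math.sinh, math.cosh, math.atan]


def run(program, registers):
    """Execute 4-byte instructions of the form: opcode, store_in, arg1, arg2."""
    for pc in range(0, len(program), 4):
        op, dest, a, b = program[pc:pc + 4]
        if op == OP_ADD:
            registers[dest] = registers[a] + registers[b]
        elif op == OP_MUL:
            registers[dest] = registers[a] * registers[b]
        elif op == OP_SIN:
            # Dedicated opcode: no table lookup, arg2 is ignored.
            registers[dest] = math.sin(registers[a])
        elif op == OP_COS:
            registers[dest] = math.cos(registers[a])
        elif op == OP_CALL_UNARY:
            # Table lookup: for unary calls arg2 carries the function_id.
            registers[dest] = UNARY_TABLE[b](registers[a])
        else:
            raise ValueError("unknown opcode %d" % op)
    return registers


# sin(x)**2 + cos(x)**2 using only dedicated opcodes; x lives in register 0.
identity_program = bytes([
    OP_SIN, 1, 0, 0,
    OP_MUL, 1, 1, 1,
    OP_COS, 2, 0, 0,
    OP_MUL, 2, 2, 2,
    OP_ADD, 3, 1, 2,
])

# sinh(x) goes through the table: function_id 0 sits in the fourth byte.
sinh_program = bytes([OP_CALL_UNARY, 1, 0, 0])
```

Calling run(identity_program, [0.5, 0.0, 0.0, 0.0]) leaves the sum in
register 3, and run(sinh_program, [0.5, 0.0]) leaves sinh(0.5) in
register 1. Everything still fits in 4-byte chunks because a unary call
has a spare slot; binary and trinary table calls would need one of the
other packing tricks discussed above.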


>I'm going to look at adding more smarts to the compiler, too. Got a
>couple books on them :-)
>Different data types could be handled by separate input arrays, and a
>conversion opcode ('int2float', say).
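
For what it's worth, here is how I picture the separate-input-arrays
idea, as a minimal Python sketch. The names (OP_INT2FLOAT, OP_ADD_FLOAT,
run_typed) and the layout are hypothetical, just one way it could look:
integer inputs sit in their own array, and a conversion opcode copies
them into the float registers before any float arithmetic touches them.

```python
# Hypothetical opcodes for a two-type machine; names are illustrative only.
OP_INT2FLOAT, OP_ADD_FLOAT = range(2)


def run_typed(program, int_inputs, float_regs):
    """Execute 4-byte instructions: opcode, store_in, arg1, arg2."""
    for pc in range(0, len(program), 4):
        op, dest, a, b = program[pc:pc + 4]
        if op == OP_INT2FLOAT:
            # arg1 indexes the integer input array; arg2 is unused.
            float_regs[dest] = float(int_inputs[a])
        elif op == OP_ADD_FLOAT:
            # Both operands are float registers.
            float_regs[dest] = float_regs[a] + float_regs[b]
        else:
            raise ValueError("unknown opcode %d" % op)
    return float_regs


# Compute float_regs[0] + int_inputs[0], storing the result in register 2.
mixed_program = bytes([
    OP_INT2FLOAT, 1, 0, 0,
    OP_ADD_FLOAT, 2, 0, 1,
])
```

With int_inputs = [3] and float_regs = [1.5, 0.0, 0.0], running
mixed_program converts the 3 into register 1 and adds it to register 0.
The arithmetic opcodes then only ever have to deal with one type.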
