[Numpy-discussion] Numpy and PEP 343
tim.hochberg at cox.net
Fri Mar 3 10:20:01 CST 2006
David M. Cooke wrote:
>Tim Hochberg <tim.hochberg at cox.net> writes:
>>David M. Cooke wrote:
>It's in. Check the setup.py -- I left some gcc'isms in there
>(-funroll-loops). That should be conditional on the compiler.
It does whine about that, but it ignores it, so I did too ;). The
misplaced declaration of dims I mentioned earlier was the real culprit.
>>This is probably thinking too far ahead, but an interesting thing to
>>try would be adding conditional expressions:
>>c = evaluate("(2*a + b) if (a > b) else (2*b + a)")
>>If that could be made to work, and work fast, it would save both
>>memory and time in those cases where you have to vary the computation
>>based on the value. At present I end up computing the full arrays for
>>each case and then choosing which result to use based on a mask, so it
>>takes three times as much space as doing it element by element.
>It's on my list of things to think about, along with implementing
>reductions (sum, in particular). It'd probably look more like
>c = evaluate("where(a > b, 2*a+b, 2*b+a)")
>because of the vector nature of the code. Doing the "if" elementwise
>would mean the bytecode program would have to be run for each element,
>instead of on a vector of some fixed length.
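
For concreteness, here is a rough sketch in plain NumPy of the two approaches being compared. It does not call evaluate() at all; where_blocked and its block parameter are hypothetical names, made up only to show how working on fixed-length vectors keeps the temporaries small while still avoiding a per-element "if".

```python
import numpy as np

a = np.array([1.0, 5.0, 3.0, 8.0])
b = np.array([4.0, 2.0, 3.0, 1.0])

# Mask-based approach: both branches are computed as full-size arrays,
# then one result is chosen per element.
c_where = np.where(a > b, 2 * a + b, 2 * b + a)


def where_blocked(a, b, block=2):
    """Hypothetical helper: same result, computed one fixed-length
    vector at a time so temporaries never exceed `block` elements."""
    out = np.empty_like(a)
    for start in range(0, len(a), block):
        s = slice(start, start + block)
        out[s] = np.where(a[s] > b[s], 2 * a[s] + b[s], 2 * b[s] + a[s])
    return out


c_blocked = where_blocked(a, b)
```

Both branches are still evaluated for every element in each block; the saving is only in the size of the intermediate arrays.
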
That makes sense. One thought I had with respect to the various numpy
functions (sin, cos, pow, etc) was to just have the bytecodes:
call_unary_function, function_id, store_in, source
call_binary_function, function_id, store_in, source1, source2
call_trinary_function, function_id, store_in, source1, source2, source3
Then just store pointers to the functions in relevant tables. In its
most straightforward form, you'd need 6-character chunks of bytecode
instead of four. However, if that turns out to slow everything else
down, I think it could be packed down to 4 again. The function_ids could
probably be packed into the opcode (as long as we stay below 200 or so
functions, which is probably safe). The other way to pack things down is
to require that one of the sources for trinary functions is always a
certain register (say register-0). That requires a bit more cleverness
at the compiler level, but is probably feasible.
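
To make the unpacked 6-slot form concrete, here is a minimal sketch of a table-driven interpreter in pure Python. Everything in it is an assumption for illustration: the opcode numbers, the table contents, the run function, and the use of scalar registers (the real thing would operate on vectors in C). Unused source slots are simply padded with 0.

```python
import math

# Hypothetical opcode numbers; not taken from the actual code.
CALL_UNARY, CALL_BINARY, CALL_TRINARY = 0, 1, 2

# Function tables indexed by function_id (contents are illustrative).
UNARY_FUNCS = [math.sin, math.cos]
BINARY_FUNCS = [math.pow, math.atan2]
TRINARY_FUNCS = [lambda cond, x, y: x if cond else y]  # scalar "where"


def run(program, registers):
    """Interpret a flat list of 6-slot instructions:
    opcode, function_id, store_in, source1, source2, source3."""
    for pc in range(0, len(program), 6):
        op, fid, dest, s1, s2, s3 = program[pc:pc + 6]
        if op == CALL_UNARY:
            registers[dest] = UNARY_FUNCS[fid](registers[s1])
        elif op == CALL_BINARY:
            registers[dest] = BINARY_FUNCS[fid](registers[s1], registers[s2])
        elif op == CALL_TRINARY:
            registers[dest] = TRINARY_FUNCS[fid](
                registers[s1], registers[s2], registers[s3])
        else:
            raise ValueError("unknown opcode %r" % op)
    return registers


regs = [0.0, 0.0, 2.0, 3.0]
program = [
    CALL_BINARY, 0, 0, 2, 3, 0,   # r0 = pow(r2, r3)
    CALL_UNARY, 1, 1, 1, 0, 0,    # r1 = cos(r1)
    CALL_TRINARY, 0, 1, 1, 0, 2,  # r1 = r0 if r1 else r2
]
run(program, regs)
```

Packing back down to 4 slots would amount to folding function_id into the opcode value and fixing one trinary source to a known register, at the cost of the extra compiler work mentioned above.
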
OT, stupid subversion question: where do I find the sandbox or
wherever this got checked in?