[Numpy-discussion] Numpy and PEP 343
oliphant at ee.byu.edu
Fri Mar 3 14:42:05 CST 2006
Tim Hochberg wrote:
> We may want to reconsider this at least partially. I tried
> implementing a few 1-argument functions two ways: first as a table
> lookup and second using a dedicated opcode. The first gave me a
> speedup of almost 60%, but the latter gave me a speedup of 100%. The
> difference surprised me, but I suspect it's due to the fact that the
> x86 supports some functions directly, so the function call gets
> optimized away for sin and cos just as it does for +-*/. That implies
> that some functions should get their own opcodes, while others are
> not worthy. This little quote from Wikipedia lists the functions we
> would want to give their own opcodes to:
> x86 (since the 80486DX processor) assembly language includes a stack
> based floating point unit which can perform addition, subtraction,
> negation, multiplication, division, remainder, square roots, integer
> truncation, fraction truncation, and scale by power of two. The
> operations also include conversion instructions which can load or
> store a value from memory in any of the following formats: Binary
> coded decimal, 32-bit integer, 64-bit integer, 32-bit floating
> point, 64-bit floating point or 80-bit floating point (upon loading,
> the value is converted to the current floating point mode). The
> x86 also includes a number of transcendental functions including
> sine, cosine, tangent, arctangent, exponentiation with the base 2,
> and logarithms to bases 2, 10, or e.
> So, that's my new proposal: some functions get their own opcodes
> (sin, cos, ln, log10, etc.), while others get shunted to table lookup
> (not sure what's in that list yet, but I'm sure there's lots).
For the same reason, I think these same functions should get their own
ufunc loops instead of using the default loop with function pointers.
Thanks for providing this link. It's a useful list.