[Numpy-discussion] numexpr thoughts
Tim Hochberg
tim.hochberg at cox.net
Tue Mar 7 11:09:02 CST 2006
David M. Cooke wrote:
>On Tue, Mar 07, 2006 at 11:33:45AM -0700, Tim Hochberg wrote:
>
>
>>Tim Hochberg wrote:
>>
>>
>>
>>>>3. Reduction. I figure this could be done at the end of the program in
>>>> each loop: sum or multiply the output register. Downcasting the
>>>> output could be done here too.
>>>>
>>>>
>>>>
>>I'm still not excited about summing over the whole output buffer though.
>>That ends up allocating and scanning through a whole extra buffer which
>>may result in a significant speed and memory hit for large arrays. Since
>>we're only doing this on the way out, there should be no problem just
>>allocating a single double (or complex) to do the sum in. On the way
>>in, this could be set to zero or one based on what the last opcode is
>>(sum or product). Then the SUM opcode could simply do something like:
>>
>>
>
>No, no, we'd just sum over the 128 element output vector (mem[0]), and
>add the result to cumulative sum. That vector should already be in
>cache, as the last op would put it there.
>
>
Ah! Ok then. That's what I was thinking of too. For some reason I
thought you were proposing building the whole result vector then summing it.
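
For anyone following along, here is a rough Python sketch of the blockwise reduction we seem to agree on. It is only an illustration, not the interpreter code: the 128-element block size comes from the message above, while `chunked_sum`, `BLOCK` and the expression `a*b + 1` are made-up stand-ins.

```python
import numpy as np

BLOCK = 128  # block length mentioned in the thread; the rest is illustrative


def chunked_sum(a, b):
    """Sum a*b + 1 one block at a time, never building the full result."""
    total = 0.0  # single accumulator: 0 for a sum, it would be 1 for a product
    for start in range(0, len(a), BLOCK):
        # Stand-in for the output register (mem[0]) after the last opcode.
        out = a[start:start + BLOCK] * b[start:start + BLOCK] + 1.0
        # Reduce only this block and fold it into the running total.
        total += out.sum()
    return total


a = np.arange(1000, dtype=float)
b = np.full(1000, 2.0)
result = chunked_sum(a, b)
```

The only extra storage is the one scalar accumulator; the block being reduced is the one the last operation just wrote.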
Here's another wrinkle: how do we deal with:
>>> a = reshape(arange(9), (3,3))
>>> sum(a)
array([ 9, 12, 15])
Just forbid it? For the time being at least?
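
To spell out the wrinkle: the result above is a reduction along the first axis, not a single scalar, so one scalar accumulator is not enough. A small sketch using current NumPy, where the axis has to be given explicitly because `sum` with no axis reduces over everything (the variable names are just for illustration):

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Reduction along the first axis: one output element per column.
col_sums = a.sum(axis=0)

# Full reduction: a single scalar, which is the case a lone accumulator handles.
total = a.sum()
```

Here `col_sums` holds 9, 12, 15 and `total` is 36. Supporting the first form would need an accumulator per output element, which is why forbidding it for now is tempting.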
>
>
>>BTW, the cleanup of the interpreter looks pretty slick.
>>
>>
>
>Not finished yet :-) Look for a checkin today (if I have time).
>
>
The compare-with-constant opcodes didn't seem to have any speed advantage, so
I ripped them all out and used COPY_C instead. I'm probably
going to rip out OP_WHERE_XXC and OP_WHERE_XCX depending on the timings
there. Should I also kill OP_ADD_C and friends as well while I'm at it?
-tim