[Numpy-discussion] Back to numexpr

Tim Hochberg tim.hochberg at cox.net
Wed Jun 14 08:50:08 CDT 2006


Ivan Vilata i Balaguer wrote:

>Tim Hochberg wrote:
>
>  
>
>>Francesc Altet wrote:
>>[...]
>>    
>>
>>>Uh, I'm afraid so. In PyTables, int64, while a bit bizarre for
>>>some users (especially on 32-bit platforms), is a type with the same rights
>>>as the others, and we would like to support it in numexpr. In fact,
>>>Ivan Vilata has already implemented this support in our local copy of numexpr,
>>>so perhaps (I say perhaps because we are in the middle of a big project now
>>>and are a bit short of time) we can provide the patch against
>>>David's latest version for your consideration. With this we can solve the
>>>problem with int64 support on 32-bit platforms (although admittedly the VM
>>>gets a bit more complicated, I really think that this is worth the effort).
>>>      
>>>
>>In addition to complexity, I worry that we'll overflow the code cache at 
>>some point and slow everything down. To be honest I have no idea at what 
>>point that is likely to happen, but I know they worry about it with the 
>>Python interpreter mainloop. Also, it becomes much, much slower to 
>>compile past a certain number of case statements under VC7, not sure 
>>why. That's mostly my problem though.
>>[...]
>>    
>>
>
>Hi!  For your information, the addition of separate, predictably-sized
>int (int32) and long (int64) types to numexpr was roughly as complicated
>as the addition of boolean types, so maybe the increase of complexity
>isn't that important (but I recognise I don't know the effect on the
>final size of the VM).
>  
>
I didn't expect it to be any worse than booleans (I would imagine it's
about the same). It's just that there's a point at which we are going to
slow down the VM due to sheer size. I don't know where that point is, so
I'm cautious. Booleans seem like they need to be supported directly in
the interpreter, while only one each (the largest one) of ints, floats
and complexes does. Booleans are different since they behave differently
from integers, so they need a separate set of opcodes. For
floats and complexes, the largest is also the most commonly used, so
this works out well. For ints, on the other hand, int32 is the most
commonly used, but int64 is the largest, so the approach of using the
largest is going to result in a speed hit for the most common integer
case. Implementing both, as you've done, solves that, but as I say, I
worry about making the interpreter core too big.
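
To put rough numbers on that trade-off, here is a toy sketch. It is not
numexpr's actual opcode table: the operation names are made up, and it
assumes every type gets every operation, which a real VM would not do.
It only shows that each extra type adds one opcode per operation, while
the largest-only approach makes int32 data move twice the bytes.

```python
import ctypes
from itertools import product

# Hypothetical operation names, for illustration only.
OPS = ["add", "sub", "mul", "div", "neg", "copy"]

def opcode_table(types):
    """Return one opcode name per (operation, type) pair."""
    return [f"{op}_{t}" for op, t in product(OPS, types)]

# Largest-only: every integer array is handled as int64.
largest_only = opcode_table(["int64", "float64", "complex128"])
# Both integer widths get their own opcodes.
both_ints = opcode_table(["int32", "int64", "float64", "complex128"])

print(len(largest_only), len(both_ints))  # 18 24

# What largest-only costs for int32 data: twice the bytes per element.
print(ctypes.sizeof(ctypes.c_int64) // ctypes.sizeof(ctypes.c_int32))  # 2
```

Whether six (or sixty) extra case statements matter for the code cache
is exactly the part this sketch cannot answer; that needs measuring.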

I expect that you've timed things before and after the addition of int64
and not gotten a noticeable slowdown. That's good, although it doesn't
entirely mean we're out of the woods, since I expect that more opcodes
that we just need to add will show up, and at some point we may run
into an opcode crunch. Or maybe I'm just being paranoid.
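
For what it's worth, a before/after comparison of that kind could be
sketched with the standard timeit module as below. The two callables
are stand-ins I made up; in practice each would evaluate the same
expression with the old and the new build of the interpreter. The
helper names are hypothetical too.

```python
import timeit

def best_time(func, repeat=5, number=100):
    """Best-of-`repeat` wall time in seconds for `number` calls of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number))

def slowdown(before, after, **kw):
    """Ratio of timings; a value above 1 means `after` is slower."""
    return best_time(after, **kw) / best_time(before, **kw)

# Stand-ins for "same expression, old build" and "same expression, new build".
data = list(range(10_000))
old_build = lambda: sum(x * 2 for x in data)
new_build = lambda: sum(x * 2 for x in data)

print(f"slowdown factor: {slowdown(old_build, new_build):.2f}")
```

Taking the minimum over several repeats filters out most scheduling
noise, so a factor that stays close to 1 across runs is a reasonable
sign that the extra opcodes are not hurting yet.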

>As soon as I have time (and a SVN version of numexpr which passes the
>tests ;) ) I will try to merge back the changes and send a patch to the
>list.  Thanks for your patience! :)
>  
>
I look forward to seeing it. Now if only I can get svn numexpr to stop
segfaulting under Windows I'll be able to do something useful...

-tim



>::
>
>	Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
>	       Cárabos Coop. V.  V  V   Enjoy Data
>	                          ""
>
>  
>





