[Numpy-discussion] numpy ufuncs and COREPY - any info?

David Cournapeau david@ar.media.kyoto-u.ac...
Thu May 28 20:56:15 CDT 2009


Andrew Friedley wrote:
> David Cournapeau wrote:
>   
>> Francesc Alted wrote: 
>>     
>>> No, that seems good enough.  But maybe you can present results in cycles/item.  
>>> This is a relatively common unit and has the advantage that it does not depend 
>>> on the frequency of your cores.
>>>       
>
> Sure, cycles is fine, but I'll argue that in this case the number still 
> does depend on the frequency of the cores, particularly as it relates to 
> the frequency of the memory bus/controllers.  A processor with a higher 
> clock rate and higher multiplier may show lower performance when 
> measuring in cycles because the memory bandwidth has not necessarily 
> increased, only the CPU clock rate.  Plus between say a xeon and opteron 
> you will have different SSE performance characteristics.  So really, any 
> sole number/unit is not sufficient without also describing the system it 
> was obtained on :)
>   

Yes, that's why people usually give the CPU type along with the
cycles/operation count :) It makes comparison easier. Sure, the
comparison is not exact, because differences between CPUs still matter.
But with cycles/computation, we could see right away that something was
strange with the numpy timing, so I think it is a better representation
for discussion/comparison.
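As a rough sketch, converting a wall-clock timing into cycles/item only needs the element count and the clock rate (all the numbers below are made up for illustration):

```python
# Hypothetical conversion of a wall-clock ufunc timing to cycles/item.
# The element count, elapsed time, and 3.0 GHz clock are made-up values.
n = 1_000_000            # number of elements processed by one ufunc call
elapsed = 0.002          # seconds for that call (made up)
clock_hz = 3.0e9         # CPU clock rate (made up)

cycles_per_item = elapsed * clock_hz / n
print(cycles_per_item)   # 6.0 cycles/item for these particular numbers
```

The caveat from above still applies: the CPU type has to accompany the number, since memory bandwidth does not scale with clock rate.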

>
> I can do minimum.  My motivation for average was to show a common-case 
> performance an application might see.  If that application executes the 
> ufunc many times, the performance will tend towards the average.
>   

The rationale for the minimum is that it removes external factors such as
other tasks taking CPU time, etc...
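For instance, with the stdlib timeit module the usual idiom is to repeat the measurement several times and keep the minimum (the cosine call and sizes below are just placeholders):

```python
import timeit
import numpy as np

x = np.linspace(0.0, 1.0, 100_000)

# repeat() returns one total time per run; taking the minimum filters out
# interference from other processes, scheduler noise, cache warm-up, etc.
timer = timeit.Timer(lambda: np.cos(x))
times = timer.repeat(repeat=5, number=100)
best = min(times) / 100   # best-case seconds per ufunc call
print(best)
```

The minimum approximates the cost of the code itself; the average, as noted above, folds system noise into the number.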

>
> I was waiting for someone to bring this up :)  I used an implementation 
> that I'm now thinking is not accurate enough for scientific use.  But 
> the question is, what is a concrete measure for determining whether some 
> cosine (or other function) implementation is accurate enough?
NaN/inf/zero handling should be tested for every function (the exact
behavior for standard functions is part of the C standard), and then the
particular values depend on the function and implementation. If your
implementation has several code paths, each code path should be tested.
But really, most implementations just test a few more or less random
known values. I know the GNU libc has some tests for its math library,
for example.
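For cosine specifically, a minimal sketch of the special-value checks could look like this (using numpy's cos as the function under test; errstate just silences the expected invalid-value warning for cos(inf)):

```python
import numpy as np

# Special-value identities the C standard specifies for cos():
# cos(0) == 1, cos(+-inf) is NaN, cos(NaN) is NaN.
with np.errstate(invalid='ignore'):
    checks = [
        np.cos(0.0) == 1.0,
        np.isnan(np.cos(np.inf)),
        np.isnan(np.cos(-np.inf)),
        np.isnan(np.cos(np.nan)),
    ]
print(all(checks))
```

A real test suite would additionally pick inputs that exercise each code path of the implementation (e.g. the argument-reduction branches).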

For single precision, brute force testing against a reference
implementation for every possible input is actually feasible, too :)
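A sketch of what that brute force could look like, enumerating float32 bit patterns and comparing against float64 as the reference (the choice of reference and the ULP-via-bit-pattern trick are my assumptions; only a small positive subrange is scanned here for speed):

```python
import numpy as np

# Every float32 value corresponds to one of the 2**32 uint32 bit patterns,
# so exhaustive enumeration is feasible. Here only a small positive
# subrange is checked, against float64 cos as the reference.
bits = np.arange(0, 2**20, dtype=np.uint32)
x = bits.view(np.float32)

ref = np.cos(x.astype(np.float64)).astype(np.float32)
got = np.cos(x)

# For same-sign finite values, the difference of the integer views of the
# bit patterns is the distance in ULPs.
ulp = np.abs(got.view(np.int32) - ref.view(np.int32))
print(ulp.max())
```

Scanning all 2**32 inputs this way takes minutes rather than hours, which is what makes the brute-force approach practical for single precision.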

David
