[Numpy-discussion] [ANN] Nanny, faster NaN functions

Keith Goodman kwgoodman@gmail....
Fri Nov 19 14:19:56 CST 2010


On Fri, Nov 19, 2010 at 12:10 PM,  <josef.pktd@gmail.com> wrote:

> What's the speed advantage of nanny compared to np.nansum that you
> have if the arrays are larger, say (1000,10) or (10000,100) axis=0 ?

Good point. In the small examples I showed so far, the speedup may
have come mostly from lower call overhead. Fortunately, it holds up
on larger arrays too:

>> arr = np.random.rand(1000, 1000)
>> timeit np.nansum(arr)
100 loops, best of 3: 4.79 ms per loop
>> timeit ny.nansum(arr)
1000 loops, best of 3: 1.53 ms per loop

>> arr[arr > 0.5] = np.nan
>> timeit np.nansum(arr)
10 loops, best of 3: 44.5 ms per loop
>> timeit ny.nansum(arr)
100 loops, best of 3: 6.18 ms per loop

>> timeit np.nansum(arr, axis=0)
10 loops, best of 3: 52.3 ms per loop
>> timeit ny.nansum(arr, axis=0)
100 loops, best of 3: 12.2 ms per loop
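
For reference, the speedup ratios implied by the timings above can be worked out with a few lines of Python. This is only arithmetic on the numbers reported in this message; the dictionary keys are labels I made up for the three cases.

```python
# Timings copied from the runs above, in milliseconds: (np.nansum, ny.nansum).
timings_ms = {
    "no NaNs, axis=None": (4.79, 1.53),
    "half NaNs, axis=None": (44.5, 6.18),
    "half NaNs, axis=0": (52.3, 12.2),
}

# Speedup = NumPy time divided by nanny time for the same case.
speedups = {case: np_ms / ny_ms for case, (np_ms, ny_ms) in timings_ms.items()}

for case, ratio in speedups.items():
    print(f"{case}: {ratio:.1f}x")
```

So on a 1000x1000 array the ratio is roughly 3x without NaNs and roughly 4x to 7x once half the elements are NaN.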

np.nansum makes a copy of the input array, builds a boolean mask of
the NaNs (a second temporary array), and then uses the mask to set
the NaNs to zero in the copy before summing. So not only is nanny
faster, it also uses less memory.
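
To make the difference concrete, here is a rough sketch of the two strategies. The first function follows the copy-and-mask steps described above; it is not NumPy's actual source. The second is only my illustration of how a sum can skip NaNs in one pass without temporaries; nanny's real implementation is not shown in this message, and a pure-Python loop like this one is of course slow. Both function names are hypothetical.

```python
import numpy as np

def nansum_copy_and_mask(arr, axis=None):
    """Sum treating NaN as zero, using a copy plus a boolean mask."""
    a = np.array(arr, dtype=float)   # full copy of the input
    mask = np.isnan(a)               # boolean mask: a second temporary array
    a[mask] = 0.0                    # zero out the NaNs in the copy
    return a.sum(axis=axis)

def nansum_single_pass(arr):
    """Sum treating NaN as zero in one pass (illustrative, assumed approach)."""
    total = 0.0
    for x in np.asarray(arr, dtype=float).flat:
        if x == x:                   # NaN is the only value not equal to itself
            total += x
    return total

arr = np.array([[1.0, np.nan], [3.0, 4.0]])
full = nansum_copy_and_mask(arr)            # sum over all elements
by_column = nansum_copy_and_mask(arr, axis=0)
one_pass = nansum_single_pass(arr)
```

The first version allocates two arrays the size of the input; the second needs only a running total, which is where the memory saving would come from.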

