# [Numpy-discussion] Does np.std() make two passes through the data?

josef.pktd@gmai... josef.pktd@gmai...
Mon Nov 22 12:51:10 CST 2010

```On Mon, Nov 22, 2010 at 1:39 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Mon, Nov 22, 2010 at 10:32 AM,  <josef.pktd@gmail.com> wrote:
>> On Mon, Nov 22, 2010 at 1:26 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>> On Mon, Nov 22, 2010 at 9:03 AM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>>
>>>> @cython.boundscheck(False)
>>>> @cython.wraparound(False)
>>>> def nanstd_twopass(np.ndarray[np.float64_t, ndim=1] a, int ddof):
>>>>    "nanstd of 1d numpy array with dtype=np.float64 along axis=0."
>>>>    cdef Py_ssize_t i
>>>>    cdef int a0 = a.shape[0], count = 0
>>>>    cdef np.float64_t asum = 0, a2sum=0, amean, ai, da
>>>>    for i in range(a0):
>>>>        ai = a[i]
>>>>        if ai == ai:
>>>>            asum += ai
>>>>            count += 1
>>>>    if count > 0:
>>>>        amean = asum / count
>>>>        asum = 0
>>>>        for i in range(a0):
>>>>            ai = a[i]
>>>>            if ai == ai:
>>>>                da = ai - amean
>>>>                asum += da
>>>>                a2sum += (da * da)
>>>>        asum = asum * asum
>>>>        return sqrt((a2sum - asum / count) / (count - ddof))
>>>>    else:
>>>>        return np.float64(NAN)
>>>
>>> This is 5% faster:
>>>
>>> @cython.boundscheck(False)
>>> @cython.wraparound(False)
>>> def nanstd_1d_float64_axis0_2(np.ndarray[np.float64_t, ndim=1] a, int ddof):
>>>    "nanstd of 1d numpy array with dtype=np.float64 along axis=0."
>>>    cdef Py_ssize_t i
>>>    cdef int a0 = a.shape[0], count = 0
>>>    cdef np.float64_t asum = 0, amean, ai
>>>    for i in range(a0):
>>>        ai = a[i]
>>>        if ai == ai:
>>>            asum += ai
>>>            count += 1
>>>    if count > 0:
>>>        amean = asum / count
>>>        asum = 0
>>>        for i in range(a0):
>>>            ai = a[i]
>>>            if ai == ai:
>>>                ai -= amean
>>>                asum += (ai * ai)
>>>        return sqrt(asum / (count - ddof))
>>>    else:
>>>        return np.float64(NAN)
>>
>> I think it would be better to write nanvar instead of nanstd and take
>> the square root only in a delegating nanstd, instead of the other way
>> around. (Also a change that should be made in scipy.stats)
>
> Yeah, I noticed that numpy does that. I was planning to have separate
> var and std functions. Here's why (from the readme file, but maybe I
> should template it, the sqrt automatically converts large ddof to
> NaN):

I'm not sure what you are saying, dropping the squareroot in the
function doesn't require nan handling in the inner loop. If you want
to return nan when count-ddof<=0, then you could just replace

if count > 0:
...

by
if count -ddof > 0:
...

Or am I missing the point?

Josef

>
> Under the hood Nanny uses a separate Cython function for each
> combination of ndim, dtype, and axis. A lot of the overhead in
> ny.nanmax, for example, is in checking that your axis is within range,
> converting non-array data to an array, and selecting the function to
> use to calculate nanmax.
>
> You can get rid of the overhead by doing all this before you, say,
> enter an inner loop:
>
>>>> arr = np.random.rand(10,10)
>>>> axis = 0
>>>> func, a = ny.func.nanmax_selector(arr, axis)
>>>> func.__name__
> 'nanmax_2d_float64_axis0'
>
> Let's see how much faster than runs:
>
>>> timeit np.nanmax(arr, axis=0)
> 10000 loops, best of 3: 25.7 us per loop
>>> timeit ny.nanmax(arr, axis=0)
> 100000 loops, best of 3: 5.25 us per loop
>>> timeit func(a)
> 100000 loops, best of 3: 2.5 us per loop
>
> Note that func is faster than the Numpy's non-nan version of max:
>
>>> timeit arr.max(axis=0)
> 100000 loops, best of 3: 3.28 us per loop
>
> So adding NaN protection to your inner loops has a negative cost!
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
```