[Numpy-discussion] Does np.std() make two passes through the data?

josef.pktd@gmai... josef.pktd@gmai...
Mon Nov 22 12:32:37 CST 2010


On Mon, Nov 22, 2010 at 1:26 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Mon, Nov 22, 2010 at 9:03 AM, Keith Goodman <kwgoodman@gmail.com> wrote:
>
>> @cython.boundscheck(False)
>> @cython.wraparound(False)
>> def nanstd_twopass(np.ndarray[np.float64_t, ndim=1] a, int ddof):
>>    "nanstd of 1d numpy array with dtype=np.float64 along axis=0."
>>    cdef Py_ssize_t i
>>    cdef int a0 = a.shape[0], count = 0
>>    cdef np.float64_t asum = 0, a2sum=0, amean, ai, da
>>    for i in range(a0):
>>        ai = a[i]
>>        if ai == ai:
>>            asum += ai
>>            count += 1
>>    if count > 0:
>>        amean = asum / count
>>        asum = 0
>>        for i in range(a0):
>>            ai = a[i]
>>            if ai == ai:
>>                da = ai - amean
>>                asum += da
>>                a2sum += (da * da)
>>        asum = asum * asum
>>        return sqrt((a2sum - asum / count) / (count - ddof))
>>    else:
>>        return np.float64(NAN)
>
> This is 5% faster:
>
> @cython.boundscheck(False)
> @cython.wraparound(False)
> def nanstd_1d_float64_axis0_2(np.ndarray[np.float64_t, ndim=1] a, int ddof):
>    "nanstd of 1d numpy array with dtype=np.float64 along axis=0."
>    cdef Py_ssize_t i
>    cdef int a0 = a.shape[0], count = 0
>    cdef np.float64_t asum = 0, amean, ai
>    for i in range(a0):
>        ai = a[i]
>        if ai == ai:
>            asum += ai
>            count += 1
>    if count > 0:
>        amean = asum / count
>        asum = 0
>        for i in range(a0):
>            ai = a[i]
>            if ai == ai:
>                ai -= amean
>                asum += (ai * ai)
>        return sqrt(asum / (count - ddof))
>    else:
>        return np.float64(NAN)

I think it would be better to write nanvar instead of nanstd and take
the square root only in a delegating nanstd, instead of the other way
around. (Also a change that should be made in scipy.stats)

Josef

> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


More information about the NumPy-Discussion mailing list