[Numpy-discussion] NA masks for NumPy are ready to test
Mark Wiebe
mwwiebe@gmail....
Wed Aug 24 21:29:44 CDT 2011
On Wed, Aug 24, 2011 at 6:09 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
<snip>
>
> - Performance with skipna is a bit disappointing:
>
> In [52]: arr = np.random.randn(1e6)
> In [54]: arr.flags.maskna = True
> In [56]: arr[::2] = np.NA
> In [58]: timeit arr.sum(skipna=True)
> 100 loops, best of 3: 7.31 ms per loop
>
> this goes down to 2.12 ms if there are no NAs present.
>
> but:
>
> In [59]: import bottleneck as bn
> In [60]: arr = np.random.randn(1e6)
> In [61]: arr[::2] = np.nan
> In [62]: timeit bn.nansum(arr)
> 1000 loops, best of 3: 1.17 ms per loop
>
> do you have a sense if this gap can be closed? I assume you've been,
> as you should, focused on a correct implementation as opposed to
> squeezing out performance.
>
It looks like the spdiv example module I created for the C-API documentation
can give a bit of an idea for some performance expectations. The example has
no specialization for strides, and it operates exactly like np.divide except
it converts the output to NA instead of dividing by zero. It *always*
creates an NA mask for the output, and does a masked loop. Here's a link to
the example module:
https://github.com/m-paradox/spdiv
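
For anyone without the NA-mask development branch, the behavior described above (an ordinary divide, except that the result is marked missing wherever the divisor is zero) can be roughly approximated with numpy.ma. This is only a sketch under assumptions: `spdiv_sketch` is a hypothetical name, a numpy.ma mask stands in for the NA mask, and it is not the C implementation in the spdiv module.

```python
import numpy as np

def spdiv_sketch(a, b):
    """Divide a by b, masking the result wherever b is zero.

    Hypothetical stand-in for spdiv: a numpy.ma mask plays the role
    of the NA mask, and the output always carries a mask.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a, b = np.broadcast_arrays(a, b)
    zero = (b == 0)
    # Only divide where the divisor is nonzero; other slots keep 0.0
    # as a placeholder and are hidden by the mask.
    out = np.zeros(a.shape, dtype=float)
    np.divide(a, b, out=out, where=~zero)
    return np.ma.masked_array(out, mask=zero)

result = spdiv_sketch([1.0, 2.0, 3.0], [2.0, 0.0, 4.0])
print(result)
```

As with spdiv, the mask is allocated even when no divisor is zero, so the cost of creating it is always paid.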
In [1]: from spdiv_mod import spdiv
In [2]: arr = np.random.randn(1e6)
Since spdiv always creates an NA mask, this is comparing an NA-masked divide
with a regular NumPy divide:
In [3]: timeit spdiv(arr, 3.1)
100 loops, best of 3: 13.8 ms per loop
In [4]: timeit arr / 3.1
10 loops, best of 3: 11.4 ms per loop
Here, the divide is causing an NA mask to be created in the output, just
like in spdiv:
In [5]: timeit spdiv(arr, np.NA)
100 loops, best of 3: 4.72 ms per loop
In [6]: timeit arr / np.NA
100 loops, best of 3: 8.71 ms per loop
Here are the same tests, but after giving 'arr' an NA mask:
In [7]: arr.flags.maskna = True
In [8]: timeit spdiv(arr, 3.1)
100 loops, best of 3: 14.2 ms per loop
In [9]: timeit arr / 3.1
10 loops, best of 3: 20.1 ms per loop
In [10]: timeit spdiv(arr, np.NA)
100 loops, best of 3: 4.02 ms per loop
In [11]: timeit arr / np.NA
100 loops, best of 3: 8.69 ms per loop
Another thought is to compare sum to count_nonzero, which is implemented in
a straightforward fashion without the masked wrapping mechanism that's in
the ufuncs.
In [12]: arr[::2] = np.NA
In [13]: np.count_nonzero(arr)
Out[13]: NA(dtype='int64')
In [14]: np.count_nonzero(arr, skipna=True)
Out[14]: 500000
In [15]: timeit np.count_nonzero(arr, skipna=True)
100 loops, best of 3: 5.86 ms per loop
In [16]: timeit np.sum(arr, skipna=True)
10 loops, best of 3: 16.1 ms per loop
In [17]: timeit np.count_nonzero(arr, skipna=False)
100 loops, best of 3: 1.85 ms per loop
In [18]: timeit np.sum(arr, skipna=False)
100 loops, best of 3: 1.86 ms per loop
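
The same sum versus count_nonzero comparison can be tried on a NumPy build without NA support by carrying the missing-value mask as a separate boolean array. This is a rough sketch, not the masked-ufunc mechanism being timed above: `missing`, `sum_skip` and `count_nonzero_skip` are hypothetical names, and the boolean indexing makes a copy, so the timings are not comparable to the numbers in this thread.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
arr = rng.standard_normal(1_000_000)

# Separate boolean mask standing in for the NA mask;
# True marks a missing element (same pattern as arr[::2] = np.NA).
missing = np.zeros(arr.shape, dtype=bool)
missing[::2] = True

def sum_skip(values, mask):
    """Sum of the elements that are not marked missing."""
    return values[~mask].sum()

def count_nonzero_skip(values, mask):
    """Count of nonzero elements among those not marked missing."""
    return np.count_nonzero(values[~mask])

total = sum_skip(arr, missing)
count = count_nonzero_skip(arr, missing)

# Best-of-3 timing, 10 calls per run, reported per call.
for fn in (sum_skip, count_nonzero_skip):
    best = min(timeit.repeat(lambda: fn(arr, missing), number=10, repeat=3))
    print(f"{fn.__name__}: {best / 10 * 1e3:.2f} ms per call")
```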
Cheers,
Mark
>
> best,
> Wes