[Numpy-discussion] [ANN] Nanny, faster NaN functions

Wes McKinney wesmckinn@gmail....
Sat Nov 20 17:56:22 CST 2010


On Sat, Nov 20, 2010 at 6:54 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
> On Sat, Nov 20, 2010 at 6:39 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>> On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>> I should make a benchmark suite.
>>
>> >>> ny.benchit(verbose=False)
>> Nanny performance benchmark
>>    Nanny 0.0.1dev
>>    Numpy 1.4.1
>>    Speed is numpy time divided by nanny time
>>    NaN means all NaNs
>>   Speed   Test                Shape        dtype    NaN?
>>   6.6770  nansum(a, axis=-1)  (500,500)    int64
>>   4.6612  nansum(a, axis=-1)  (10000,)     float64
>>   9.0351  nansum(a, axis=-1)  (500,500)    int32
>>   3.0746  nansum(a, axis=-1)  (500,500)    float64
>>  11.5740  nansum(a, axis=-1)  (10000,)     int32
>>   6.4484  nansum(a, axis=-1)  (10000,)     int64
>>  51.3917  nansum(a, axis=-1)  (500,500)    float64  NaN
>>  13.8692  nansum(a, axis=-1)  (10000,)     float64  NaN
>>   6.5327  nanmax(a, axis=-1)  (500,500)    int64
>>   8.8222  nanmax(a, axis=-1)  (10000,)     float64
>>   0.2059  nanmax(a, axis=-1)  (500,500)    int32
>>   6.9262  nanmax(a, axis=-1)  (500,500)    float64
>>   5.0688  nanmax(a, axis=-1)  (10000,)     int32
>>   6.5605  nanmax(a, axis=-1)  (10000,)     int64
>>  48.4850  nanmax(a, axis=-1)  (500,500)    float64  NaN
>>  14.6289  nanmax(a, axis=-1)  (10000,)     float64  NaN
>>
>> You can also use the makefile to run the benchmark: make bench
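The "Speed" column above is numpy time divided by nanny time. For anyone who wants to compute a ratio like that without Nanny installed, here is a minimal sketch using the standard timeit module. It is not Nanny's benchit: speed_ratio and loop_nansum are hypothetical names, and a plain Python loop stands in for the candidate function, so expect a ratio well below 1.

```python
import timeit

import numpy as np


def speed_ratio(reference, candidate, arr, repeat=3, number=10):
    """Return reference time divided by candidate time (> 1 means candidate is faster)."""
    t_ref = min(timeit.repeat(lambda: reference(arr), repeat=repeat, number=number))
    t_new = min(timeit.repeat(lambda: candidate(arr), repeat=repeat, number=number))
    return t_ref / t_new


def loop_nansum(arr):
    """Hypothetical stand-in candidate: sum the non-NaN elements of a 1d array."""
    total = 0.0
    for x in arr:
        if x == x:  # NaN is the only value that is not equal to itself
            total += x
    return total


a = np.random.rand(10000)
a[::7] = np.nan  # sprinkle in some NaNs

ratio = speed_ratio(lambda x: np.nansum(x, axis=-1), loop_nansum, a)
print("%8.4f  nansum(a, axis=-1)  %s  %s" % (ratio, a.shape, a.dtype))
```

Taking the minimum over several repeats is a common way to reduce timing noise; whether benchit does the same is not stated in this thread.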
>
> Keith (and others),
>
> What would you think about creating a library of mostly Cython-based
> "domain-specific functions"? I mean things like rolling statistical
> moments and nan* functions like you have here: functions that operate
> only on NumPy arrays and don't necessarily belong in NumPy or SciPy
> (but could be included down the road). You were already talking about
> this on the statsmodels mailing list for larry. I spent a lot of time
> writing a bunch of these for pandas over the last couple of years, and
> I would have relatively few qualms about moving them out of pandas
> and introducing a dependency. You could do the same for larry, and
> then we'd all be relying on the same well-vetted and tested codebase.
>
> - Wes
>
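To make "rolling statistical moments" concrete, here is a rough pure-NumPy sketch of a NaN-aware rolling mean. It is an assumption-laden illustration, not code from pandas, larry, or Nanny: the name rolling_nanmean and its signature are hypothetical, and the proposed library would presumably do this in Cython.

```python
import numpy as np


def rolling_nanmean(a, window):
    """Hypothetical helper: mean over a trailing window of a 1d array, ignoring NaNs.

    The first window - 1 outputs are NaN, as is any window that holds only NaNs.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    a = np.asarray(a, dtype=np.float64)
    valid = ~np.isnan(a)
    filled = np.where(valid, a, 0.0)  # NaNs contribute nothing to the sums

    # Prefix sums with a leading zero, so sum(a[i:j]) == csum[j] - csum[i].
    csum = np.concatenate(([0.0], np.cumsum(filled)))
    ccount = np.concatenate(([0], np.cumsum(valid)))

    sums = csum[window:] - csum[:-window]
    counts = ccount[window:] - ccount[:-window]

    out = np.full(a.shape, np.nan)
    with np.errstate(invalid="ignore", divide="ignore"):
        out[window - 1:] = sums / counts  # 0 / 0 gives NaN for all-NaN windows
    return out


x = np.array([1.0, np.nan, 3.0, 4.0, np.nan, np.nan, np.nan])
result = rolling_nanmean(x, 3)
print(result)
```

The prefix-sum trick keeps the cost linear in the array length, but it can lose precision on long arrays with large values and does not handle infinities; a compiled loop that updates a running sum and count avoids both problems.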

By the way, I wouldn't mind pushing all of my datetime-related code
(date range generation, date offsets, etc.) into this new library,
too.
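
For readers unfamiliar with the terms, "date range generation" with a "date offset" can be sketched using only the standard library. This is a minimal illustration, not the pandas code mentioned above: date_range, its parameters, and the weekdays_only flag are all hypothetical.

```python
from datetime import date, timedelta


def date_range(start, end, offset=timedelta(days=1), weekdays_only=False):
    """Hypothetical helper: yield dates from start to end (inclusive), stepping by offset.

    With weekdays_only=True, Saturdays and Sundays are skipped.
    """
    if offset <= timedelta(0):
        raise ValueError("offset must be positive")
    current = start
    while current <= end:
        if not weekdays_only or current.weekday() < 5:  # Monday is 0, Sunday is 6
            yield current
        current += offset


# Friday, Nov 19, 2010 through Wednesday, Nov 24, 2010, weekdays only.
business_days = list(date_range(date(2010, 11, 19), date(2010, 11, 24), weekdays_only=True))
weekly = list(date_range(date(2010, 11, 1), date(2010, 11, 30), offset=timedelta(weeks=1)))
print(business_days)
print(weekly)
```

A fixed timedelta cannot express calendar-aware steps such as "month end" or "next business day", which is the kind of thing a dedicated offset class would have to handle.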

