[Numpy-discussion] Warnings in numpy.ma.test()

josef.pktd@gmai... josef.pktd@gmai...
Wed Mar 17 14:18:01 CDT 2010


On Wed, Mar 17, 2010 at 3:12 PM, Christopher Barker
<Chris.Barker@noaa.gov> wrote:
> Eric Firing wrote:
>> My motivation for going
>> to the C level was speed and control; many ma operations are very slow
>> compared to their numpy counterparts, and moving the mask handling to C
>> can erase nearly all of this penalty.
>
> really? very cool. I was thinking about this the other day, and thinking
> that in some grand future vision, all numpy arrays should be masked
> arrays (or could be). The idea is that missing/invalid data is a really
> common case, and it is simply wonderful to have the software handle that
> gracefully.
>
> One of the things I liked about MATLAB was that NaNs were well handled
> almost all the time. Given all the limitations of NaN, having a masked
> array is a better way to go, but I'd love it if they were "just there",
> and therefore EVERY numpy function and package built on numpy would
> handle them gracefully. I had thought that there would be a significant
> performance penalty, and thus there would be a boatload of "if_mask"
> code all over the place, but maybe not.

Many functions are defined differently when values are missing. In
stats, regression, or time series analysis that assumes equally spaced
time periods, you always need special methods to handle missing
values.
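
A rough illustration of the point, not taken from the thread: for an
equally spaced series, dropping the missing observation and then
differencing gives a different answer than differencing with the mask
propagated. The sample values and the names `y`, `naive`, and `aware`
are made up for this sketch.

```python
import numpy as np
import numpy.ma as ma

# Equally spaced series with the third observation missing.
# The 0.0 under the mask is only a placeholder value.
y = ma.masked_array([1.0, 2.0, 0.0, 4.0, 5.0], mask=[0, 0, 1, 0, 0])

# Naive: drop the missing value, then difference. The jump from 2.0 to
# 4.0 is treated as a single time step, which it is not.
naive = np.diff(y.compressed())

# Mask-aware: difference neighbouring periods; any difference that
# touches the missing observation is itself masked.
aware = y[1:] - y[:-1]

print(naive)   # three differences, one of them spanning the gap
print(aware)   # four differences, the two around the gap masked
```

Which of the two is "right" depends on the method, which is why a single
generic treatment of masked values does not cover every function.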

Plus, you have to operate on two arrays and keep both in memory, so
the penalty is pretty high even in C.
(On the statsmodels mailing list, Wes compared different
implementations of moving average, although with a C implementation
the difference wouldn't be as huge as it currently is.)
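
To make the "two arrays" cost concrete, here is a minimal sketch, not
one of the implementations from that comparison. The function
`masked_moving_average` is a hypothetical helper written for this
example: it has to carry a sum over the data and a count over the mask
side by side.

```python
import numpy as np
import numpy.ma as ma

def masked_moving_average(x, window):
    """Moving average that skips masked values (hypothetical helper).

    Each output is the mean of the unmasked values in its window; a
    window with no valid values yields a masked output.
    """
    x = ma.asarray(x, dtype=float)
    data = ma.filled(x, 0.0)                      # masked entries add 0
    valid = (~ma.getmaskarray(x)).astype(float)   # 1.0 where data is valid
    kernel = np.ones(window)
    # Two passes: one over the data, one over the validity mask.
    sums = np.convolve(data, kernel, mode="valid")
    counts = np.convolve(valid, kernel, mode="valid")
    # np.maximum avoids dividing by zero; those entries are masked anyway.
    return ma.masked_array(sums / np.maximum(counts, 1), mask=(counts == 0))

x = ma.masked_array([1.0, 2.0, 3.0, 4.0, 5.0], mask=[0, 1, 0, 0, 0])
result = masked_moving_average(x, 2)

# The mask is a second array stored alongside the data.
data_bytes = x.data.nbytes
mask_bytes = ma.getmaskarray(x).nbytes
```

With float64 data a boolean mask adds one byte per eight bytes of data,
but every operation still has to read and combine both arrays.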

Josef


>
> Anyway, just a fantasy, but C-level ufuncs that support masks would be
> great.
>
> -Chris
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker@noaa.gov

