[Numpy-discussion] Median again

Andrew Straw strawman@astraw....
Tue Jan 29 16:07:07 CST 2008


Considering that many of the statistical functions (mean, std, median)
must iterate over all the data and that people (or at least I)
typically call them sequentially on the same data, it may make sense to
provide a single super-function that avoids the repeated passes.

Instead of:
x_mean = np.mean(x)
x_median = np.median(x)
x_std = np.std(x)
x_min = np.min(x)
x_max = np.max(x)

We do:
x_stats = np.get_descriptive_stats(x,
    stats=['mean','median','std','min','max'], axis=-1)
And x_stats is a dictionary with 'mean', 'median', 'std', 'min', 'max' keys.

The implementation could reduce the number of iterations over the data
in this case. The implementation wouldn't have to be optimized
initially, but could be gradually sped up once the interface is in
place. I bring this up now to suggest such an idea as a more-general
alternative to the "medianwithaxis" function proposed. What do you
think? (Perhaps something like this already exists?) And, finally, this
all surely belongs in scipy, but we already have stuff in numpy that
can't be removed without seriously breaking backwards compatibility...
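
To make the idea concrete, here is a rough sketch of what such a function might look like. It is only an illustration under some assumptions: get_descriptive_stats does not exist in numpy, so it is written as a standalone function, and the sharing strategy (one sort for min, max and median; one mean reused by std) is just one possible choice. It shows where work can be shared and makes no claim to being fast.

```python
import numpy as np

def get_descriptive_stats(x, stats=("mean", "median", "std", "min", "max"), axis=-1):
    """Hypothetical helper (not part of numpy): dict of statistics along `axis`."""
    x = np.asarray(x, dtype=float)
    wanted = set(stats)
    unknown = wanted - {"mean", "median", "std", "min", "max"}
    if unknown:
        raise ValueError("unknown statistics: %s" % sorted(unknown))

    result = {}

    # One sort along the axis serves min, max and median together.
    if wanted & {"median", "min", "max"}:
        s = np.sort(x, axis=axis)
        n = s.shape[axis]
        if "min" in wanted:
            result["min"] = np.take(s, 0, axis=axis)
        if "max" in wanted:
            result["max"] = np.take(s, n - 1, axis=axis)
        if "median" in wanted:
            # Average the two middle elements (they coincide when n is odd).
            lower = np.take(s, (n - 1) // 2, axis=axis)
            upper = np.take(s, n // 2, axis=axis)
            result["median"] = 0.5 * (lower + upper)

    # The mean is computed once and reused for the standard deviation.
    if wanted & {"mean", "std"}:
        mean = x.mean(axis=axis)
        if "mean" in wanted:
            result["mean"] = mean
        if "std" in wanted:
            dev = x - np.expand_dims(mean, axis)
            # Population standard deviation, matching np.std's default (ddof=0).
            result["std"] = np.sqrt((dev ** 2).mean(axis=axis))

    return result

# Usage: one call instead of five separate ones.
x = np.array([[1.0, 2.0, 3.0, 4.0], [4.0, 6.0, 8.0, 10.0]])
x_stats = get_descriptive_stats(x, stats=["mean", "median", "std", "min", "max"], axis=-1)
```

Note that sorting costs O(n log n), so when only min and max are requested this sketch is slower than two plain passes. A tuned implementation would pick its strategy from the requested set, which is the kind of gradual optimization that can happen once the interface is fixed.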

-Andrew

Matthew Brett wrote:
> Hi,
>   
>>>> median moved to mediandim0
>>>> implementation of medianwithaxis or similar, with same call
>>>> signature as mean.
>>>>
>>>> Deprecation warning for use of median, and return of mediandim0 for
>>>> now.  Eventual move of median to return medianwithaxis.
>>>>         
>>> This would confuse people even more, I'm afraid. First they're told
>>> that median() is deprecated, and then later on it becomes the standard
>>> function to use. I would actually prefer a short pain rather than a
>>> long one.
>>>       
>
> I was thinking the warning could be something like:
>
> "The current and previous versions of numpy use a version of median
> that is not consistent with other summary functions such as mean.  The
> calling convention of median will change in a future version of numpy
> to match that of the other summary functions.  This compatible future
> version is implemented as medianwithaxis, and will become the default
> implementation of median.  Please change any code using median to call
> medianwithaxis specifically, to maintain compatibility with future
> numpy APIs."
>
>   
>> I would certainly like median to take the axis keyword. The axis
>> keyword (and its friends) could be added to 1.0.5 with the default
>> being 1 instead of None, so that it keeps compatibility with the 1.0
>> API. Then, with 1.1 (an API-breaking release) the default can be
>> changed to None to restore consistency with mean, etc.
>>     
>
> But that would be very surprising to a new user, and might lead to
> some hard to track down silent bugs at a later date.
>
> Matthew


