[SciPy-User] scoreatpercentile behaviour
Thu Jan 24 11:41:09 CST 2013
On Thu, Jan 24, 2013 at 11:59 AM, Andreas Hilboll <firstname.lastname@example.org> wrote:
> I just had a quick look into scipy.stats.scoreatpercentile, and was
> disappointed to see that it's currently not possible to do the
> calculation for more than one percentile at a time (``per`` is scalar).
> So I had a quick look into the sources, and was surprised to see that
> apparently, the function expects ordered input ``a``, which is not noted
> in the docstring. (Or maybe it's just my misunderstanding of the word
> 'percentile'. I had expected the function to work on the input's
> **values**, not on the indices.)
I'm not sure what you mean here:
``a`` is sorted by the function, and then we take roughly the
(n*per/100)-th smallest value, interpolating between neighbours.
That gives you the quantile value of the input array.
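A rough sketch of that sort-then-interpolate idea, not the actual scipy
source: the helper name ``percentile_sketch`` is made up for illustration,
and the fractional index per/100 * (n - 1) is an assumption that matches
numpy's default linear interpolation. It only handles 1-D input.

```python
import numpy as np

def percentile_sketch(a, per):
    """Hypothetical helper: percentile of unsorted 1-D data (per in 0..100)."""
    values = np.sort(np.asarray(a, dtype=float))  # sorting makes input order irrelevant
    idx = per / 100.0 * (values.size - 1)         # fractional position in the sorted data
    lo = int(np.floor(idx))                       # neighbour below
    hi = min(lo + 1, values.size - 1)             # neighbour above (clamped at the end)
    frac = idx - lo                               # weight of the upper neighbour
    return values[lo] * (1.0 - frac) + values[hi] * frac

data = [7, 1, 5, 3, 9]        # deliberately unsorted
shuffled = [9, 3, 7, 5, 1]    # same values, different order

median = percentile_sketch(data, 50)
same_median = percentile_sketch(shuffled, 50)
p10 = percentile_sketch(data, 10)
```

Both orderings give the same result, so the function works on the values
and the caller does not need to pre-sort ``a``.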
> Is this a bug or a feature? If it's a feature, this should be very
> explicitly noted in the docstring, I think. I'm willing to do so if you
> can confirm that the current behaviour is actually wanted.
> In the sources' TODO, it's stated that a more general percentile
> implementation would be welcome. I might be able to contribute something
> here; any hints on where to start?
There is a pull request that follows the numpy implementation.
stats.mstats has different options:
stats.mstats.scoreatpercentile and stats.mstats.mquantiles.
(I also wrote a draft for a fully vectorized version of it.)
It's one of those functions where I don't like the current
implementation much, but don't know what the alternative should be.
For example, in statsmodels we also use stats.mstats.mquantiles because
it has interpolation and an axis option.
(So, I'm staying partially on the sidelines on this.)
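
For illustration, a small sketch of getting several quantiles at once
along an axis with stats.mstats.mquantiles. The sample array is made up.
Note that ``prob`` takes fractions in [0, 1], not percentages, and that
the default ``alphap``/``betap`` are 0.4; setting both to 1 should
correspond to plain linear interpolation, which is what np.percentile
uses by default.

```python
import numpy as np
from scipy.stats.mstats import mquantiles

# Made-up sample data: 5 observations (rows) of 2 variables (columns).
x = np.array([[7.0, 20.0],
              [1.0, 40.0],
              [5.0, 10.0],
              [3.0, 50.0],
              [9.0, 30.0]])

probs = [0.25, 0.5, 0.75]  # fractions, not percentages

# Default plotting positions (alphap=0.4, betap=0.4), per column.
q_default = mquantiles(x, prob=probs, axis=0)

# alphap=1, betap=1: linear interpolation between order statistics.
q_linear = mquantiles(x, prob=probs, alphap=1.0, betap=1.0, axis=0)

# numpy equivalent for comparison (percentages here).
q_numpy = np.percentile(x, [25, 50, 75], axis=0)

print(q_default.shape)  # one row per requested quantile, one column per variable
```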