[Numpy-discussion] Converting ranks to a Gaussian
Mon Jun 9 21:06:24 CDT 2008
On Mon, Jun 9, 2008 at 4:45 PM, Robert Kern <firstname.lastname@example.org> wrote:
> On Mon, Jun 9, 2008 at 18:34, Keith Goodman <email@example.com> wrote:
>> Does anyone have a function that converts ranks into a Gaussian?
>> I have an array x:
>>>> import numpy as np
>>>> x = np.random.rand(5)
>> I rank it:
>>>> x_ranked = x.argsort().argsort()
>>>> x_ranked
>> array([3, 1, 4, 2, 0])
> There are subtleties in computing ranks when ties are involved. Take a
> look at the implementation of scipy.stats.rankdata().
Good point. I had to deal with ties and missing data. I bet
scipy.stats.rankdata() is faster than my implementation.
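To make the point about ties concrete, here is a minimal sketch comparing the double-argsort trick with scipy.stats.rankdata() on an array that contains a repeated value. The sample values are made up for illustration, and kind="stable" is passed only so that the tie-breaking order is reproducible.

```python
import numpy as np
from scipy import stats

# Made-up sample with a tie: 0.3 appears twice.
x = np.array([0.3, 0.1, 0.3, 0.7])

# Double argsort: 0-based ranks; tied values get different ranks,
# decided here by their position in the array (stable sort).
ranks_argsort = x.argsort(kind="stable").argsort()

# rankdata: 1-based ranks; tied values share the average of their ranks.
ranks_avg = stats.rankdata(x)

print(ranks_argsort)
print(ranks_avg)
```

With the double argsort the two equal values end up with different ranks, while rankdata() gives both the same averaged rank, which is usually what you want before a rank-based transform.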
>> I would like to convert the ranks to a Gaussian without using scipy.
> No dice. You are going to have to use scipy.special.ndtri somewhere. A
> basic transformation (off the top of my head, I have no idea if this
> is statistically meaningful):
> scipy.special.ndtri((ranks + 1.0) / (len(ranks) + 1.0))
> Barring tied first or last items, this should give equal weight to
> each of the tails outside of the range of your data.
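A minimal sketch of that transformation, assuming 1-based ranks from scipy.stats.rankdata(), so that ranks / (n + 1) is the same quantity as (ranks + 1) / (n + 1) for 0-based ranks. The helper name ranks_to_gaussian and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy import special, stats


def ranks_to_gaussian(x):
    """Hypothetical helper: map values to normal scores via their ranks."""
    x = np.asarray(x, dtype=float)
    ranks = stats.rankdata(x)      # 1-based; ties get the average rank
    p = ranks / (len(x) + 1.0)     # probabilities strictly inside (0, 1)
    return special.ndtri(p)        # inverse of the standard normal CDF


x = np.array([0.42, 0.07, 0.93, 0.25, 0.01])
scores = ranks_to_gaussian(x)
print(scores)
```

The smallest value maps to probability 1 / (n + 1) and the largest to n / (n + 1), so the mass left below the minimum and above the maximum is the same, which is the "equal weight to each of the tails" property. This sketch does not handle missing data.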
Nice. Thank you. It passes the never-wrong chi-by-eye test:
from scipy import special  # the special submodule has to be imported explicitly here
r = np.arange(1000)
g = special.ndtri((r + 1.0) / (len(r) + 1.0))
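As a rough numerical companion to the chi-by-eye test, the following sketch checks a few properties one would expect of the transformed ranks. These checks are an illustration added here, not something from the thread.

```python
import numpy as np
from scipy import special

r = np.arange(1000)
g = special.ndtri((r + 1.0) / (len(r) + 1.0))

# The scores should increase with rank.
is_increasing = bool(np.all(np.diff(g) > 0))

# The scores should be symmetric about zero: g[i] == -g[n - 1 - i].
is_symmetric = bool(np.allclose(g, -g[::-1]))

# Mean should be near 0 and standard deviation close to 1.
print(is_increasing, is_symmetric, g.mean(), g.std())
```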
I wasn't able to use scipy.special.ndtri after a plain import scipy, as you
did. I'm new to scipy, but I had to do
from scipy import special
I'm using the scipy from Debian Lenny.