[Numpy-discussion] Converting ranks to a Gaussian

Robert Kern robert.kern@gmail....
Mon Jun 9 18:45:08 CDT 2008

On Mon, Jun 9, 2008 at 18:34, Keith Goodman <kwgoodman@gmail.com> wrote:
> Does anyone have a function that converts ranks into a Gaussian?
> I have an array x:
>>> import numpy as np
>>> x = np.random.rand(5)
> I rank it:
>>> x_ranked = x.argsort().argsort()
>>> x_ranked
>   array([3, 1, 4, 2, 0])

There are subtleties in computing ranks when ties are involved. Take a
look at the implementation of scipy.stats.rankdata().
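
To illustrate the tie issue, here is a minimal sketch comparing the double-argsort trick with scipy.stats.rankdata(). The input array is made up for illustration. Note that rankdata() returns 1-based ranks and, with its default method, averages the ranks of tied values.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical data with one tie (the two 0.3 values).
x = np.array([0.3, 0.1, 0.3, 0.2])

# Double argsort: 0-based ranks. Ties are broken by position, so the
# two equal values end up with different ranks. A stable sort makes
# the tie-breaking order deterministic.
argsort_ranks = x.argsort(kind="stable").argsort()

# rankdata: 1-based ranks. With the default method="average", tied
# values share the mean of the ranks they would otherwise occupy.
average_ranks = rankdata(x)

print(argsort_ranks)  # [2 0 3 1]
print(average_ranks)  # [3.5 1.  3.5 2. ]
```

Which treatment of ties is appropriate depends on what you plan to do with the ranks afterwards.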

> I would like to convert the ranks to a Gaussian without using scipy.

No dice. You are going to have to use scipy.special.ndtri somewhere. A
basic transformation (off the top of my head, I have no idea if this
is statistically meaningful):

  scipy.special.ndtri((ranks + 1.0) / (len(ranks) + 1.0))

Barring tied first or last items, this should leave equal probability
mass, 1/(len(ranks) + 1), in each tail beyond the range of your data.
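
As a rough sketch of the transformation above, assuming 0-based ranks with no ties (the variable names p and z are only for illustration):

```python
import numpy as np
from scipy.special import ndtri

# 0-based ranks, as produced by x.argsort().argsort() in the question.
ranks = np.array([3, 1, 4, 2, 0])
n = len(ranks)

# Map ranks 0..n-1 to probabilities 1/(n+1) .. n/(n+1), which lie
# strictly inside (0, 1), so ndtri never returns +/- infinity.
p = (ranks + 1.0) / (n + 1.0)

# ndtri is the inverse of the standard normal CDF, so z holds the
# normal quantiles corresponding to those probabilities.
z = ndtri(p)

# Probability mass left below the smallest and above the largest point.
lower_tail = p.min()
upper_tail = 1.0 - p.max()

print(z)
print(lower_tail, upper_tail)
```

The scores come out symmetric about zero and keep the ordering of the original ranks; both tails hold 1/(n + 1) of the probability.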

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
