[SciPy-user] return type of inverse cdf of discrete distribution?

joep josef.pktd@gmail....
Sat Oct 25 13:03:38 CDT 2008


What should be the return type of the inverse cdf (and inverse
survival function) of a discrete distribution?

The problem is the handling of inf at the boundary and nans for
invalid input. The options are
* return floating point (double), with inf and nans returned as for the
continuous distributions, or
* return integer and throw an exception if the return values are inf or
nans (or restrict the input to the open interval (0,1)).
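
To make the two options concrete, here is a minimal pure-Python sketch
for a Poisson distribution. The names ppf_float and ppf_int are
hypothetical and this is not the scipy.stats implementation; the
choice to return 0 at q = 0 follows what qpois does in R and is an
assumption, as is a moderate mu (so that exp(-mu) does not underflow).

```python
import math

def ppf_float(q, mu):
    """Option 1 (sketch): always return a float.

    nan for q outside [0, 1], inf at q == 1, otherwise the smallest k
    with P(X <= k) >= q for X ~ Poisson(mu).
    """
    if math.isnan(q) or q < 0.0 or q > 1.0:
        return math.nan
    if q == 1.0:
        return math.inf
    k = 0
    term = math.exp(-mu)   # P(X = 0)
    cdf = term
    # Accumulate P(X = k) until the cdf reaches q. The term > 0 guard
    # stops the loop if rounding keeps the sum just below q.
    while cdf < q and term > 0.0:
        k += 1
        term *= mu / k
        cdf += term
    return float(k)

def ppf_int(q, mu):
    """Option 2 (sketch): return an int, reject q outside (0, 1)."""
    if not (0.0 < q < 1.0):
        raise ValueError("q must lie in the open interval (0, 1)")
    return int(ppf_float(q, mu))

print([ppf_float(q, 25) for q in (0.5, 1.0, 2.0)])
print(ppf_int(0.5, 25))
```

With option 1 a vector of results can always be stored in one float
array; with option 2 the caller has to filter or catch bad inputs
before calling.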

Currently, scipy.stats returns integers (long), but the treatment is
not consistent: e.g. zeros are returned instead of nans for invalid
input, and inf on the boundary throws a casting error.

I just checked in R:
continuous distribution: inverse cdf returns nans and infs, e.g.
> qnorm(c(0.5,1.0,2.0), 0, 25)
[1]   0 Inf NaN

discrete distribution in VGAM: only accepts values in (0,1), e.g.

> qpospois(c(0.5,1.0,2.0), 25)
Error in qpospois(c(0.5, 1, 2), 25) : bad input for argument "p"
> qpospois(1.0, 25)
Error in qpospois(1, 25) : bad input for argument "p"
> qpospois(0.0, 25)
Error in qpospois(0, 25) : bad input for argument "p"
> qpospois(c(0.0000001,0.5,0.999999999), 25)
[1]  4 25 60
>

However, the stats package in R does no domain checking and returns
nans and inf:

> aa=qpois(c(0.5,1.0,2.0), 25)
Warning message:
In qpois(p, lambda, lower.tail, log.p) : NaNs produced
> aa
[1]  25 Inf NaN

> typeof(aa)
[1] "double"
> aap=qpospois(c(0.5,0.99), 25)
> typeof(aap)
[1] "double"
> aa1=qpois(0.5, 25)
> typeof(aa1)
[1] "double"

Both VGAM and stats in R return double.

Changing the return type of the scipy.stats discrete distributions to
double would break the API; I don't know whether this is relevant or
whether anybody cares.

An alternative would be to choose the return type depending on the
presence of nans or infs, but that might not be very reliable for
applications.
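
A small sketch of why a data-dependent return type is awkward,
assuming the raw quantiles are first computed as floats. The function
choose_return_type is hypothetical, not an existing scipy function.

```python
import math

def choose_return_type(values):
    """Return ints if every value is finite, otherwise leave floats."""
    if all(math.isfinite(v) for v in values):
        return [int(v) for v in values]
    return list(values)

a = choose_return_type([25.0, 30.0])      # all finite
b = choose_return_type([25.0, math.inf])  # one boundary value

print(type(a[0]).__name__, type(b[0]).__name__)
```

The same quantile, 25, comes back as an int in one call and as a
float in the other, so downstream code cannot rely on the type
without inspecting the data first.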

Josef

