Mon Jul 27 15:29:51 CDT 2009
On Mon, Jul 27, 2009 at 3:55 PM, Robert Kern<firstname.lastname@example.org> wrote:
> On Mon, Jul 27, 2009 at 14:47, Pierre GM<email@example.com> wrote:
>> I'm puzzled by this result:
>> >>> from scipy.stats.distributions import nbinom
>> >>> nbinom(.3,.15).ppf(nbinom(.3,.15).cdf(np.arange(20)))
>> array([ 1., 2., 2., 3., 5., 5., 6., 8., 9., 10., 11.,
>> 12., 13., 14., 15., 16., 17., 18., 19., 20.])
>> I would have naturally expected np.arange(20).
> Floating point shenanigans. The CDF and PPF of discrete distributions
> are step functions. Round-tripping involves evaluating the PPF around
> the step. Naturally, floating point errors are going to nudge you to
> the left or right of that step.
>> Using np.round instead of np.ceil in nbinom_gen._ppf seems to solve
>> the issue.
>> Is there any reason not to do it ?
> ceil() is correct; round() is not. round() would be okay if the only
> inputs are expected to be outputs of the CDF, but one frequently needs
> the PPF to take all values in [0,1], like for example doing random
> number generation via inversion.
Do you think it would be useful to round if we are epsilon (?) close
to the next integer? It is more likely that users have a case like
Pierre's where the answer might be the closest integer, instead of an
epsilon below. It would move the floating point error, but to an, in
actual usage, less likely location.
I think, it is in scipy.special where I saw some code that treats
anything that is within 1e-8 (?) of an integer as the integer.
> Robert Kern
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> Scipy-dev mailing list
More information about the Scipy-dev