[Scipy-tickets] [SciPy] #620: scipy.stats.distributions.binom.pmf returns incorrect values
SciPy
scipy-tickets@scipy....
Mon Oct 27 12:40:19 CDT 2008
#620: scipy.stats.distributions.binom.pmf returns incorrect values
----------------------------+-----------------------------------------------
Reporter: robertbjornson | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: 0.7.0
Component: scipy.stats | Version:
Severity: normal | Resolution:
Keywords: |
----------------------------+-----------------------------------------------
Comment (by josefpktd):
Currently, the pmf is calculated from the discrete difference in the cdf,
which uses a formula in scipy.special.
I briefly looked at the case with using the above pmf formula:
It increases the precision of the calculated probabilities when the number
of trials is small, e.g. n = 10, in the range of 1e-16 (or maybe 1e-13) in
the example.
However, the formula with comb, cannot handle large number of trials e.g.
n > 1100 and produces mostly nans:
{{{
>>> pmfn = bpmf(np.arange(5001),5000, 0.99)
>>> np.sum(np.isnan(pmfn))
4677
>>> pmfn = bpmf(np.arange(10001),10000, 0.99)
>>> np.sum(np.isnan(pmfn))
9704
>>> pmfn[2000:2010]
array([ NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN])
}}}
I don't think 1e-16 or so change in accuracy is important enough to make
this change, since it does not work as a full substitute to the current
version, unless there is a special formula somewhere in scipy that can
calculate the pmf correctly for both small and large numbers of trials.
Josef
--
Ticket URL: <http://scipy.org/scipy/scipy/ticket/620#comment:1>
SciPy <http://www.scipy.org/>
SciPy is open-source software for mathematics, science, and engineering.
More information about the Scipy-tickets
mailing list