[Scipy-tickets] [SciPy] #620: scipy.stats.distributions.binom.pmf returns incorrect values

SciPy scipy-tickets@scipy....
Mon Oct 27 12:40:19 CDT 2008

#620: scipy.stats.distributions.binom.pmf returns incorrect values
 Reporter:  robertbjornson  |        Owner:  somebody
     Type:  defect          |       Status:  new     
 Priority:  normal          |    Milestone:  0.7.0   
Component:  scipy.stats     |      Version:          
 Severity:  normal          |   Resolution:          
 Keywords:                  |  
Comment (by josefpktd):

 Currently, the pmf is calculated from the discrete difference in the cdf,
 which uses a formula in scipy.special.

 I briefly looked at the case with using the above pmf formula:

 It increases the precision of the calculated probabilities when the number
 of trials is small, e.g. n = 10, in the range of 1e-16 (or maybe 1e-13) in
 the example.

 However, the formula with comb, cannot handle large number of trials e.g.
 n > 1100 and produces mostly nans:

 >>> pmfn = bpmf(np.arange(5001),5000, 0.99)
 >>> np.sum(np.isnan(pmfn))
 >>> pmfn = bpmf(np.arange(10001),10000, 0.99)
 >>> np.sum(np.isnan(pmfn))
 >>> pmfn[2000:2010]
 array([ NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN])

 I don't think 1e-16 or so change in accuracy is important enough to make
 this change, since it does not work as a full substitute to the current
 version, unless there is a special formula somewhere in scipy that can
 calculate the pmf correctly for both small and large numbers of trials.


Ticket URL: <http://scipy.org/scipy/scipy/ticket/620#comment:1>
SciPy <http://www.scipy.org/>
SciPy is open-source software for mathematics, science, and engineering.

More information about the Scipy-tickets mailing list