[Scipy-tickets] [SciPy] #1218: Very poor relative accuracy in discrete survival functions

SciPy Trac scipy-tickets@scipy....
Thu Jul 1 19:28:31 CDT 2010


#1218: Very poor relative accuracy in discrete survival functions
---------------------+------------------------------------------------------
 Reporter:  dsimcha  |       Owner:  somebody
     Type:  defect   |      Status:  new     
 Priority:  normal   |   Milestone:  0.8.0   
Component:  Other    |     Version:  0.7.0   
 Keywords:           |  
---------------------+------------------------------------------------------
 Because rv_discrete in distributions.py simply uses 1.0 - cdf as its
 default implementation of the survival function, any distribution that
 does not override sf has very poor relative accuracy in the upper tail:
 once the tail probability drops below the rounding error of the cdf
 (about 1e-16 at best), the subtraction returns only noise.  Here's an
 example:

 R:

 {{{
 > phyper(20000, 99000, 110000, 39000, lower.tail = FALSE)
 [1] 2.752693e-66
 }}}


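 SciPy (note that R and SciPy use different parametrizations of the
 hypergeometric distribution: R's phyper(q, m, n, k) corresponds to
 hypergeom.sf(q, m + n, m, k), so the second call below is the one that
 matches the R example):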


 {{{
 >>> from scipy.stats import hypergeom
 >>> hypergeom.sf(20000, 99000 + 110000, 99000, 110000)
 1.0
 >>> hypergeom.sf(20000, 99000 + 110000, 99000, 39000)
 -1.6360179877494829e-10
 }}}
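
A minimal sketch of why 1.0 - cdf cannot work this far into the tail, using plain Python floats rather than SciPy itself. The variable names are illustrative, and the tail value is simply the R result quoted above. Even a perfectly computed cdf would round to exactly 1.0 in double precision, so the subtraction has nothing left to return.

```python
import sys

# R's answer from the example above, taken as the "true" tail probability.
true_tail = 2.752693e-66

# The best possible double-precision cdf: 1 - 2.75e-66 is not representable,
# so it rounds to exactly 1.0.
ideal_cdf = 1.0 - true_tail

# The default survival function then subtracts that cdf from 1.0.
naive_sf = 1.0 - ideal_cdf

print(sys.float_info.epsilon)  # spacing of doubles just above 1.0
print(ideal_cdf == 1.0)        # the tail information is already gone here
print(naive_sf)
```

The -1.6e-10 reported above is presumably rounding error accumulated inside the cdf computation itself, which the subtraction then exposes.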


 SciPy (summing the pmf over the upper tail by hand):


 {{{
 >>> result = 0
 >>> for x in range(20001, 39001):
 ...     result += hypergeom.pmf(x, 99000 + 110000, 99000, 39000)
 ...
 >>>
 >>> result
 2.752692949998141e-66
 }}}
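
The manual summation above could be packaged as a direct upper-tail sum. The following is only a sketch under stated assumptions, not SciPy's implementation: it is pure standard-library Python, the function names (log_binom, hypergeom_logpmf, hypergeom_sf_by_summation) are hypothetical, and the argument order (k, M, n, N) follows the hypergeom calls in this ticket. The pmf is evaluated in log space via math.lgamma, and the terms are rescaled by the largest one before summing.

```python
import math


def log_binom(a, b):
    """Natural log of the binomial coefficient C(a, b), via lgamma."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)


def hypergeom_logpmf(k, M, n, N):
    """Log pmf: M items in total, n of type I, N drawn, k of type I drawn."""
    return log_binom(n, k) + log_binom(M - n, N - k) - log_binom(M, N)


def hypergeom_sf_by_summation(k, M, n, N):
    """P(X > k), summed term by term instead of computed as 1.0 - cdf."""
    upper = min(n, N)                      # largest value X can take
    lower = max(k + 1, 0, N - (M - n))     # first term of the tail in the support
    if lower > upper:
        return 0.0
    log_terms = [hypergeom_logpmf(x, M, n, N) for x in range(lower, upper + 1)]
    peak = max(log_terms)
    # Factor out the largest term so the remaining exponentials are <= 1.
    return math.exp(peak) * math.fsum(math.exp(t - peak) for t in log_terms)


# Same parameters as the ticket's example.
tail = hypergeom_sf_by_summation(20000, 99000 + 110000, 99000, 39000)
print(tail)
```

For these parameters the result should be on the order of 1e-66 and close to the R value quoted above. The cost is one pmf evaluation per tail term, so a production version would likely switch between this sum and 1.0 - cdf depending on which tail is shorter.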

-- 
Ticket URL: <http://projects.scipy.org/scipy/ticket/1218>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

