[Scipy-tickets] [SciPy] #479: Output of scipy.stats.rv_discrete does not match input due to indexing problem

SciPy scipy-tickets@scipy....
Wed Aug 15 05:28:10 CDT 2007


#479: Output of scipy.stats.rv_discrete does not match input due to indexing
problem
---------------------+------------------------------------------------------
 Reporter:  mnan     |       Owner:  somebody
     Type:  defect   |      Status:  new     
 Priority:  normal   |   Milestone:          
Component:  Other    |     Version:  0.5.2   
 Severity:  blocker  |    Keywords:          
---------------------+------------------------------------------------------
 When the probabilities passed to rv_discrete (the 2nd element of the kwarg
 'values') contain zeros, the sampled values are assigned to the wrong
 states (see the script below).

 The problem code might be in rv_discrete.__fix_loc:
 {{{
         if loc is None:
             loc = 0
 }}}
 i.e. if loc is None it defaults to 0, so the location of the returned value
 becomes the zeroth value; with the zeros placed at the front, the non-zero
 probabilities get shifted towards the end.

 This is a problem because the output does not accurately reflect the
 distribution that is being sampled from.
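
 For comparison, the behaviour one would expect can be sketched with plain
 inverse-transform sampling in NumPy. This is an independent illustration,
 not the rv_discrete implementation; the function name sample_discrete and
 its parameters are hypothetical, and it assumes a NumPy version that
 provides np.random.default_rng.

```python
import numpy as np

def sample_discrete(states, probs, size, seed=None):
    """Draw `size` samples from `states` with probabilities `probs`.

    Hypothetical helper for illustration only; not part of scipy.
    """
    states = np.asarray(states)
    probs = np.asarray(probs, dtype=float)
    if np.any(probs < 0) or not np.isclose(probs.sum(), 1.0):
        raise ValueError("probs must be non-negative and sum to 1")
    # Cumulative distribution, normalised so the last entry is exactly 1.0.
    cdf = np.cumsum(probs)
    cdf /= cdf[-1]
    u = np.random.default_rng(seed).random(size)  # uniform on [0, 1)
    # side='right' picks the first index whose cumulative probability is
    # strictly greater than u, so zero-probability states are never chosen.
    idx = np.searchsorted(cdf, u, side='right')
    return states[idx]

draws = sample_discrete([0, 1, 2, 3], [0.6, 0.0, 0.0, 0.4], size=10000, seed=0)
print(sorted(set(draws.tolist())))
```

 With this approach a state keeps its own index whether or not its
 neighbours have zero probability, which is the behaviour the script below
 checks for.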

 {{{
 from __future__ import division, print_function
 from scipy.stats import rv_discrete

 # Compares the output of scipy.stats.rv_discrete with the input distribution.

 STATES = [0, 1, 2, 3]
 SIZE = 10000

 def count(inpt):
     # Tally how often each state was drawn and print its observed frequency;
     # the probability should reflect that of the input distribution.
     opt = dict(zip(STATES, (0, 0, 0, 0)))
     for i in inpt:
         opt[i] += 1
     for k in STATES:
         print(k, ' : ', opt[k] / SIZE)

 def bugdemo():
     test0 = rv_discrete(name='sample',
                         values=(STATES, [0.3, 0.4, 0.2, 0.1])).rvs(size=SIZE)
     count(test0)
     print('Output approximately matches input in the correct order: the '
           'problem only occurs when zeros are included in the initial '
           'distribution (see below)')
     print()
     test1 = rv_discrete(name='sample',
                         values=(STATES, [0.5, 0.4, 0, 0.1])).rvs(size=SIZE)
     count(test1)
     print('State 1 and State 2 have been mixed up.')
     print()
     test2 = rv_discrete(name='sample',
                         values=(STATES, [0.6, 0, 0, 0.4])).rvs(size=SIZE)
     count(test2)
     print('State 2 is sampled with the probability that State 0 should be '
           'sampled.')
     print()
     test3 = rv_discrete(name='sample',
                         values=(STATES, [0, 1.0, 0, 0])).rvs(size=SIZE)
     count(test3)
     print('State 3 appears to have the probability State 1 should have.')

 if __name__ == '__main__':
     bugdemo()
 }}}
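
 A more compact variant of the same check compares observed frequencies
 against the input probabilities within a tolerance instead of printing
 them. This is only a sketch: the helper name empirical_pmf and the 0.03
 tolerance are assumptions, and it presumes a SciPy release in which rvs
 accepts random_state (which is not the case for 0.5.2).

```python
import numpy as np
from scipy.stats import rv_discrete

def empirical_pmf(states, probs, size=10000, seed=0):
    """Sample from rv_discrete and return the observed frequency per state.

    Hypothetical helper for illustration only.
    """
    dist = rv_discrete(name='sample', values=(states, probs))
    sample = dist.rvs(size=size, random_state=seed)
    # Fraction of draws equal to each state; states never drawn give 0.0.
    return np.array([np.mean(sample == s) for s in states])

states = [0, 1, 2, 3]
probs = [0.5, 0.4, 0.0, 0.1]
freqs = empirical_pmf(states, probs)
for s, p, f in zip(states, probs, freqs):
    print(s, ':', 'expected', p, 'observed', f)

# On a release without the indexing problem, each state should land close to
# its own input probability.
print('matches input:', bool(np.allclose(freqs, probs, atol=0.03)))
```

 On an affected version the final line would be expected to print False for
 this input, since the mass of one state shows up under another.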

-- 
Ticket URL: <http://projects.scipy.org/scipy/scipy/ticket/479>
SciPy <http://www.scipy.org/>
SciPy is open-source software for mathematics, science, and engineering.

