[SciPy-User] the skellam distribution

Bruce Southey bsouthey@gmail....
Mon Nov 9 09:53:33 CST 2009


On 11/09/2009 08:40 AM, josef.pktd@gmail.com wrote:
> 2009/11/8 Ernest Adrogué<eadrogue@gmx.net>:
>    
>> Hi,
>>
>> In case somebody is interested, or you want to include it
>> in scipy. I used these specs here from the R package:
>> cran.r-project.org/web/packages/skellam/skellam.pdf
>>
>> Note that I am no statician, somebody who knows what he's
>> doing (as opposed to me ;) should verify it's correct.
>>
>>
>> import numpy
>> import scipy.stats.distributions
>>
>> # Skellam distribution
>>
>> ncx2 = scipy.stats.distributions.ncx2
>>
>> class skellam_gen(scipy.stats.distributions.rv_discrete):
>>     def _pmf(self, x, mu1, mu2):
>>         if x<  0:
>>             px = ncx2.pdf(2*mu2, 2*(1-x), 2*mu1)*2
>>         else:
>>             px = ncx2.pdf(2*mu1, 2*(x+1), 2*mu2)*2
>>         return px
>>     def _cdf(self, x, mu1, mu2):
>>         x = numpy.floor(x)
>>         if x<  0:
>>             px = ncx2.cdf(2*mu2, x*(-2), 2*mu1)
>>         else:
>>             px = 1-ncx2.cdf(2*mu1, 2*(x+1), 2*mu2)
>>         return px
>>     def _stats(self, mu1, mu2):
>>         mean = mu1 - mu2
>>         var = mu1 + mu2
>>         g1 = (mu1 - mu2) / numpy.sqrt((mu1 + mu2)**3)
>>         g2 = 1 / (mu1 + mu2)
>>         return mean, var, g1, g2
>> skellam = skellam_gen(a=-numpy.inf, name="skellam", longname='A Skellam',
>>                       shapes="mu1,mu2", extradoc="")
>>
>>      
> Thanks, I think the distribution of the difference of two poisson
> distributed random variables could be useful.
>
> Would you please open an enhancement ticket for this at
> http://projects.scipy.org/scipy/report/1
>
> I had only a brief look at it so far, I had never looked at the
> Skellam distribution before, and just read a few references.
>
> The "if x<  0 .. else ..." will have to be replace with a
> "numpy.where" assignment, since the methods are supposed to work with
> arrays of x (as far as I remember)
>
> _rvs could be implemented directly instead of generically (I don't
> find the reference, where I saw it, right now).
>
> Documentation will be necessary,  a brief description in the
> (currently) extradocs, and a listing of the properties for the
> description of the distributions currently in the stats tutorial.
>
> I have some background questions, which address the limitation of the
> implementation (but are not really necessary for inclusion into
> scipy).
>
> The description in R mentions several implementation of Skellam. Do
> you have a rough idea what the range of parameters are for which the
> implementation using ncx produces good results? Do you know if any
> other special functions would produce good results over a larger
> range, e.g. using Bessel function?
>
> Wikipedia, http://en.wikipedia.org/wiki/Skellam_distribution , also
> mentions (but doesn't describe) the case of Skellam distribution with
> correlated Poisson distributions. Do you know what the difference to
> your implementation would be?
>
> Tests for a new distribution will be picked up by the generic tests,
> but it would be useful to have some extra tests for extreme/uncommon
> parameter ranges. Do you have any comparisons with R, since you
> already looked it?
>
>
> Thanks again, I'm always looking out for new useful distributions,
> (but I have to find the time to do the testing and actual
> implementation).
>
> Josef
>
>
>    
>> Bye.
>>
>> --
>> Ernest
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>      
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>    

Generally any R code can not be used in numpy because R is GPL. Usually 
R code is also licensed under GPL  so translation from R to Python/numpy 
still maintains the original license. So the code not used by numpy 
unless that code is licensed under a BSD compatible license.

You *must* show that you implementation is from a BSD-compatible source 
not from the R package. I can see that your code is very simple so there 
should be an viable alternative source.

Also, in the _stats function why do you do not re-use the mean and var 
variables in computing the g1 and g2 variables?

What are 'x, mu1, mu2' ?
This looks like a scalar implementation so you need to either check that 
or allow for array-like inputs.

Bruce




More information about the SciPy-User mailing list