[Numpy-discussion] bug in stats.randint

josef.pktd@gmai... josef.pktd@gmai...
Thu Apr 23 09:28:21 CDT 2009


On Thu, Apr 23, 2009 at 9:56 AM,  <josef.pktd@gmail.com> wrote:
> On Thu, Apr 23, 2009 at 9:27 AM, Flavio Coelho <fccoelho@gmail.com> wrote:
>>
>> Hi,
>>
>> I stumbled upon something I think is a bug in scipy:
>>
>> In [4]: stats.randint(1.,15.).ppf([.1, .2,.3,.4,.5])
>> Out[4]: array([ 2.,  3.,  5.,  6.,  7.])
>>
>> When you pass float arguments to stats.randint and then call the ppf method,
>> you get an array of floats, which is clearly wrong. The rvs method doesn't
>> display this bug, so I think it is a matter of poor type checking in the ppf
>> implementation...
>>
>
> I switched to using floats intentionally, to get correct handling of
> inf and nan. The argument checking is generic for all discrete
> distributions and not special-cased. NaNs are converted to zero when
> cast to integers, which is wrong and very confusing, and inf raises an
> exception. I prefer correct numbers to correct types. See the examples
> below.
>
> If your results contain no inf or nan, you can cast them to int yourself.
>
> Josef
>
>>>> aint = np.zeros(5,dtype=int)
>>>> aint[0]= np.nan
>>>> aint
> array([0, 0, 0, 0, 0])
>>>> aint[1]= np.inf
> Traceback (most recent call last):
>  File "<pyshell#134>", line 1, in <module>
> OverflowError: cannot convert float infinity to long
>
>>>> from scipy import stats
>>>> stats.poisson.ppf(1,1)
> inf
>>>> stats.poisson.ppf(2,1)
> nan
>
>>>> stats.poisson.ppf(1,1).astype(int)
> -2147483648
>>>> aint[2] = stats.poisson.ppf(1,1)
> Traceback (most recent call last):
>  File "<pyshell#140>", line 1, in <module>
> OverflowError: cannot convert float infinity to long
>
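
The cast suggested above can be guarded so that inf and nan are never
silently turned into integers. A minimal sketch, assuming a hypothetical
helper name `to_int_if_finite` (it is not part of scipy or numpy) and using
literal values in place of actual ppf output:

```python
import numpy as np

def to_int_if_finite(values):
    """Hypothetical helper: cast to int64 only if every entry is finite.

    If any entry is inf or nan, the float array is returned unchanged,
    so the special values are preserved instead of being mangled.
    """
    arr = np.asarray(values, dtype=float)
    if np.all(np.isfinite(arr)):
        return arr.astype(np.int64)
    return arr

# Values like the randint ppf output quoted above: all finite, so cast.
quantiles = to_int_if_finite([2., 3., 5., 6., 7.])

# Values containing inf and nan: left as floats.
with_special = to_int_if_finite([1., np.inf, np.nan])
```

Using int64 explicitly, instead of plain int, avoids depending on the
platform's default integer size.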

There are still some corner cases in the distributions that might not
work correctly, or that definitely do not. Please report anything that
looks suspicious.

I wasn't sure what the limiting behavior of ppf is. In this example ppf
still looks good, but cdf is restricted to arguments that fit in a long
integer.

Josef

>>> stats.poisson.ppf(1-1e-15,1e10)
10000794155.0
>>> stats.poisson.ppf(1-1e-15,1e10).astype(int)
-2147483648

This also works:
>>> stats.poisson.ppf(0.5,1e8).astype(int)
100000000
>>> stats.poisson.cdf(100000000,1e8)
0.50002670277569705

But this doesn't:
>>> stats.poisson.ppf(0.5,1e10)
9999999996.0
>>> stats.poisson.cdf(9999999996,1e10)
nan
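
The -2147483648 results above are what a cast to a 32-bit integer produces
when the float is out of range. A minimal sketch of a range-checked cast,
assuming a hypothetical helper name `checked_int_cast` (not a numpy or scipy
function) and using the ppf value from the session above as a literal:

```python
import numpy as np

def checked_int_cast(value, dtype=np.int64):
    """Hypothetical helper: cast a float to an integer type, or raise.

    Raises OverflowError for inf, nan, or values outside the range of
    `dtype`, instead of returning a wrapped or garbage integer.
    """
    info = np.iinfo(dtype)
    value = float(value)
    if not np.isfinite(value) or not (info.min <= value <= info.max):
        raise OverflowError("%r does not fit in %s" % (value, np.dtype(dtype).name))
    return dtype(value)

big = 10000794155.0  # ppf(1-1e-15, 1e10) as reported above

as_int64 = checked_int_cast(big)  # fits comfortably in 64 bits

try:
    checked_int_cast(big, np.int32)  # too large for 32 bits
    fits_int32 = True
except OverflowError:
    fits_int32 = False
```

The check is done before the cast, so the caller gets an exception in every
out-of-range case, whatever the platform's default integer size is.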

