# [SciPy-User] incomplete beta function B(x; k+1, 0) ?

josef.pktd@gmai...
Sun Jun 27 11:51:05 CDT 2010

```On Sat, Jun 26, 2010 at 6:56 PM, Pauli Virtanen <pav@iki.fi> wrote:
> Sat, 26 Jun 2010 17:59:09 -0400, josef.pktd wrote:
>> Is there an incomplete beta function with a zero argument B(x; k+1, 0)
>> available in scipy?
>>
>>>>> special.betainc(np.arange(5)+1, 0, 0.5)
>> array([ NaN,  NaN,  NaN,  NaN,  NaN])
>
>>>> print special.betainc.__doc__
> betainc(x1, x2, x3[, out])
> y=betainc(a,b,x) returns the incomplete beta integral of the
> arguments, evaluated from zero to x: gamma(a+b) / (gamma(a)*gamma(b))
> * integral(t**(a-1) (1-t)**(b-1), t=0..x).
>
> So the function in Scipy has an additional normalization prefactor. The
> prefactor however seems to be zero for b=0, so this function in Scipy
> probably can't easily be used for finding the value of the integral at
> b=0.
>
> But if you just set b=1e-99 (or anything smaller than the machine
> epsilon) and divide the prefactor away, you should end up with the result
> you are looking for.
>
> At b=0 the integral is logarithmically divergent towards x->1, and a
> finite b > 0 cuts this off around x ~ 1 - exp(-1/b) --- so it shouldn't
> matter. The rest of the integrand should also saturate at b ~ machine
> epsilon.
>
> But apparently, there's something fishy in the betainc algorithm for x
> close to 1 and b close to 0, for example this discontinuity:
>
>        >>> sc.betainc(1, 1.000002e-6, 1 - 1e-6)
>        1.3815442755027441e-05
>        >>> sc.betainc(1, 1.000001e-6, 1 - 1e-6)
>        1.2397031262149499e-05
>
> So while things are in principle OK, you probably shouldn't trust the
> results beyond x > 1 - 1e-3.

Thanks, it works this way, though with low precision for x close to 1:

>>> from scipy.special import gamma
>>> a = np.arange(20)
>>> b = 1.000002e-16
>>> x = gamma(a+b) / (gamma(a)*gamma(b))   # the normalization prefactor

>>> p=0.5; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([  0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
        -1.11022302e-16,  -1.11022302e-16,  -1.11022302e-16,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         1.11022302e-16,   1.11022302e-16,   1.11022302e-16,
         1.11022302e-16])

>>> p=1-1e-6; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([ 0.10246633,  0.10226642,  0.10206728,  0.10186891,  0.1016713 ,
        0.10147445,  0.10127835,  0.101083  ,  0.10088839,  0.10069451,
        0.10050137,  0.10030896,  0.10011726,  0.09992629,  0.09973603,
        0.09954648,  0.09935764,  0.09916949,  0.09898204])
>>> p=1-1e-3; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([  1.16573418e-14,  -1.38777878e-16,  -6.77236045e-15,
         9.88098492e-15,   3.65263375e-14,  -3.55271368e-15,
        -1.98729921e-14,   1.66533454e-16,   3.55271368e-15,
         1.42663659e-14,   7.10542736e-15,  -1.14908083e-14,
         2.05946371e-14,   1.94289029e-15,   1.62647673e-14,
         1.17683641e-14,  -2.16493490e-15,   2.55351296e-15,
         1.35447209e-14])
>>> p=1-1e-4; (1 + special.betainc(np.arange(20)+1, 1.000002e-16, p)/x/np.log(1-p))[1:] - stats.logser.cdf(np.arange(1,20),p)
array([  3.78329849e-06,   3.70895520e-06,   3.63619061e-06,
         3.56496856e-06,   3.49525364e-06,   3.42701179e-06,
         3.36020887e-06,   3.29481244e-06,   3.23079046e-06,
         3.16811184e-06,   3.10674608e-06,   3.04666343e-06,
         2.98783512e-06,   2.93023291e-06,   2.87382897e-06,
         2.81859665e-06,   2.76450961e-06,   2.71154229e-06,
         2.65966984e-06])

So, I guess this is not really worth it, since the generic calculation
is just a sum of relatively simple terms; only for p close to 1 does the
sum have a large number of terms.

Josef

>
> --
> Pauli Virtanen
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
```
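A minimal sketch of the workaround discussed in the thread: evaluate scipy's regularized `betainc` at a tiny `b` and multiply the complete beta function `B(a, b)` back in to undo the normalization. The helper name `incbeta_b0` and the series-based check are illustrative, not from the thread; the log-series identity is the one Josef's transcript uses.

```python
import numpy as np
from scipy import special, stats


def incbeta_b0(a, x, b=1e-16):
    """Approximate the unnormalized incomplete beta B(x; a, 0)
    = integral(t**(a-1) / (1 - t), t=0..x) for a >= 1."""
    # special.betainc is the *regularized* integral I_x(a, b); multiplying
    # by the complete beta function B(a, b) = gamma(a)*gamma(b)/gamma(a+b)
    # removes the normalization.  With b below machine epsilon the cutoff
    # near x = 1 (at roughly 1 - exp(-1/b)) no longer matters, but per the
    # thread, results for x > 1 - 1e-3 should not be trusted.
    return special.betainc(a, b, x) * special.beta(a, b)


# Check against the series B(x; k+1, 0) = sum_{j > k} x**j / j, obtained
# by expanding 1/(1 - t) and integrating term by term.
k, x = 3, 0.5
series = sum(x**j / j for j in range(k + 1, 200))
print(incbeta_b0(k + 1, x), series)

# The log-series CDF identity from the transcript:
# cdf(k) = 1 + B(p; k+1, 0) / log(1 - p)
p = 0.5
print(1 + incbeta_b0(k + 1, p) / np.log(1 - p), stats.logser.cdf(k, p))
```

For p close to 1 the series needs many terms, which is the trade-off Josef weighs at the end of the mail.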