[SciPy-User] generic expectation operator
josef.pktd@gmai...
josef.pktd@gmai...
Wed Aug 22 18:48:56 CDT 2012
On Wed, Aug 22, 2012 at 5:03 PM, <josef.pktd@gmail.com> wrote:
> On Wed, Aug 22, 2012 at 4:49 PM, nicky van foreest <vanforeest@gmail.com> wrote:
>> Hi,
>>
>> For numerous purposes I need to compute the expectation of some
>> function with respect to some probability distribution. i.e. E f(X) =
>> \int f(x) dG(x), where f is the function and G the distribution. When
>> G is continuous, this is simple, just use scipy.integrate.quad.
>> However, if G is discrete, things become more complicated since quad
>> often misses the spikes in the density. To deal with this problem, in
>> part, I came up with the following code, but it is not completely ok
>> yet, see below.
>>
>> #!/usr/bin/env python
>>
>> import scipy.stats as stats
>> import scipy.integrate
>> from math import sqrt
>>
>> def E(f, X):
>> if hasattr(X,'dist'):
>> if isinstance(X.dist, stats.rv_discrete): # hint from Ralph
>> g = X.pmf
>> else:
>> g = X.pdf
>> return scipy.integrate.quad(lambda x: x*g(x), X.dist.a, X.dist.b)
>> else:
>> return sum(f(x)*p for x,p in zip(X.xk, X.pk))
>>
>> # now the following works:
>> X = stats.norm(3,sqrt(3))
>> print E(sqrt, X)
I don't know why this works, sqrt of a normal distributed random
variable has lots of nans (or complex numbers)
>>> f = lambda x: np.sqrt(np.maximum(x,0))
>>> stats.norm.expect(f, loc=3, scale=np.sqrt(3))
1.6394037171420484
>>> f = lambda x: np.sqrt(np.abs(x))
>>> stats.norm.expect(f, loc=3, scale=np.sqrt(3))
1.6708150951924068
>>
>> # this is ok too
>> grid = [50, 100, 150, 200, 250, 300]
>> probs=[2,3,2,1.,0.5,0.5]
>> tot = sum(probs)
>> probs = [p/tot for p in probs]
>> X = stats.rv_discrete(values=(grid,probs), name='sizeDist')
>> print E(sqrt, X)
>>
>> # but this is not ok
>>
>> X = stats.poisson(3)
>> print E(sqrt, X)
>>> stats.poisson.expect(np.sqrt, args=(3,))
1.6309535375094115
Josef
>>
>> ------------------------------------------
>>
>> This is the output:
>>
>> 11.0383463182
>> (7.771634331185016e-19, 1.916327339868129e-17)
>> (2.9999999999999996, 1.1770368678494054e-08)
>>
>> So indeed, quad misses the spikes in the poisson distribution.
>>
>> Is there any nice way to resolve this? Moreover, is there a better way
>> to build the expectation operator? Perhaps, if all works, it would be
>> a useful add-on to distributions.py, or else I can include it in the
>> distribution tutorial, if other people also think this is useful.
>
> quad works only for continuous functions
>
> did you look at
>
>>>> stats.poisson.expect
> <bound method poisson_gen.expect of
> <scipy.stats.distributions.poisson_gen object at 0x0320FC90>>
>>>> stats.norm.expect
> <bound method norm_gen.expect of <scipy.stats.distributions.norm_gen
> object at 0x031EB890>>
>
> improvements welcome, especially making it more robust
>
> Josef
>
>>
>> thanks
>>
>> Nicky
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
More information about the SciPy-User
mailing list