[Numpy-discussion] Power distribution

josef.pktd@gmai... josef.pktd@gmai...
Fri Aug 7 17:57:15 CDT 2009


On Fri, Aug 7, 2009 at 6:13 PM, <josef.pktd@gmail.com> wrote:
> On Fri, Aug 7, 2009 at 5:42 PM, <josef.pktd@gmail.com> wrote:
>> On Fri, Aug 7, 2009 at 5:25 PM, Andrew Hawryluk<HAWRYLA@novachem.com> wrote:
>>> Hmm ... good point.
>>> It appears to give a probability distribution proportional to x**(a-1),
>>> but I see no good reason why the domain should be limited to [0,1].
>>>
>>> def test(a):
>>>    nums =
>>> plt.hist(np.random.power(a,100000),bins=100,ec='none',fc='#dddddd')
>>>    x = np.linspace(0,1,200)
>>>    plt.plot(x,nums[0][-1]*x**(a-1))
>>>
>>> Andrew
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: numpy-discussion-bounces@scipy.org [mailto:numpy-discussion-
>>>> bounces@scipy.org] On Behalf Of alan@ajackson.org
>>>> Sent: 7 Aug 2009 2:49 PM
>>>> To: Discussion of Numerical Python
>>>> Subject: Re: [Numpy-discussion] Power distribution
>>>>
>>>> I don't think that is it, since the one in numpy has a range
>>> restricted
>>>> to the interval 0-1.
>>>>
>>>> Try out hist(np.random.power(5, 1000000), bins=100)
>>>>
>>>> >You might get better results for 'power-law distribution'
>>>> >http://en.wikipedia.org/wiki/Power_law
>>>> >
>>>> >Andrew
>>>> >
>>>> >> -----Original Message-----
>>>> >> From: numpy-discussion-bounces@scipy.org [mailto:numpy-discussion-
>>>> >> bounces@scipy.org] On Behalf Of alan@ajackson.org
>>>> >> Sent: 7 Aug 2009 11:45 AM
>>>> >> To: Discussion of Numerical Python
>>>> >> Subject: [Numpy-discussion] Power distribution
>>>> >>
>>>> >> Documenting my way through the statistics modules in numpy, I ran
>>>> >> into the Power Distribution.
>>>> >>
>>>> >> Anyone know what that is? I Googled for it, and found a lot of
>>> stuff
>>>> >on
>>>> >> electricity, but no reference for a statistical distribution of
>>> that
>>>> >> name. Does it have a common alias?
>>>> >>
>>>> >> --
>>
>>
>> same is in Travis' notes on the distribution and scipy.stats.distributions
>> domain in [0,1], but I don't know anything about it either
>>
>> ## Power-function distribution
>> ##   Special case of beta dist. with d =1.0
>>
>> class powerlaw_gen(rv_continuous):
>>    def _pdf(self, x, a):
>>        return a*x**(a-1.0)
>>    def _cdf(self, x, a):
>>        return x**(a*1.0)
>>    def _ppf(self, q, a):
>>        return pow(q, 1.0/a)
>>    def _stats(self, a):
>>        return a/(a+1.0), a*(a+2.0)/(a+1.0)**2, \
>>               2*(1.0-a)*sqrt((a+2.0)/(a*(a+3.0))), \
>>               6*polyval([1,-1,-6,2],a)/(a*(a+3.0)*(a+4))
>>    def _entropy(self, a):
>>        return 1 - 1.0/a - log(a)
>> powerlaw = powerlaw_gen(a=0.0, b=1.0, name="powerlaw",
>>                        longname="A power-function",
>>                        shapes="a", extradoc="""
>>
>> Power-function distribution
>>
>> powerlaw.pdf(x,a) = a*x**(a-1)
>> for 0 <= x <= 1, a > 0.
>> """
>>                        )
>>
>
>
> it looks like it's the same distribution, even though it doesn't use
> the random numbers from the numpy function
>
> high p-values with Kolmogorov-Smirnov, see below
>
> I assume it is a truncated version of *a* powerlaw distribution, so
> that a can be large, which would be impossible in the open domain
> case. But a quick search, I only found powerlaw applications that
> refer to the tail behavior.
>
> Josef
>
>>>> rvs = np.random.power(5, 100000)
>>>> stats.kstest(rvs,'powerlaw',(5,))
> (0.0021079715221341555, 0.76587118275752697)
>>>> rvs = np.random.power(5, 1000000)
>>>> stats.kstest(rvs,'powerlaw',(5,))
> (0.00063983013407076239, 0.80757958281509501)
>>>> rvs = np.random.power(0.5, 1000000)
>>>> stats.kstest(rvs,'powerlaw',(0.5,))
> (0.00081823148457027539, 0.51478478398950211)
>

I found a short reference in Johnson, Kotz, Balakrishnan vol. 1 where
it is refered to as the "power-function" distribution.
roughly: if X is pareto (which kind) distributed, then Y=X**(-1) is
distributed according to the power-function distribution. JKB have an
extra parameter in there and is a bit more general then the scipy
version, or maybe it is just the scale parameter included in the
density function.

It is also in NIST data plot, but I didn't find the html reference
page, but only the pdf

http://docs.google.com/gview?a=v&q=cache%3AEgQ6bRkeJl8J%3Awww.itl.nist.gov%2Fdiv898%2Fsoftware%2Fdataplot%2Frefman2%2Fauxillar%2Fpowpdf.pdf+power-function+distribution&hl=en&gl=ca&pli=1

the pdf-files for powpdf and powcdf  are here
http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/homepage.htm


I can look some more a bit later tonight.

Josef


More information about the NumPy-Discussion mailing list