[Numpy-discussion] Normalized histogram for data ranges 0 .. 1 returns PDF > 1

josef.pktd@gmai...
Tue Feb 2 07:44:15 CST 2010


On Tue, Feb 2, 2010 at 8:05 AM, Manos Tsagias <tsagias@gmail.com> wrote:
> Hi all,
>  I'm using numpy.histogram with normed=True with 1D data ranging 0 .. 1. The
> results return probabilities greater than 1. The trapezoidal integral
> returns 1, but I'm afraid this is due to the bin assigned values. Example
> follows:
> >>> from numpy import *
> >>> a = arange(0, 1, 0.1)
> >>> histogram(a, normed=True)
> (array([ 1.11111111,  1.11111111,  1.11111111,  1.11111111,  1.11111111,
>         1.11111111,  1.11111111,  1.11111111,  1.11111111,  1.11111111]),
> array([ 0.  ,  0.09,  0.18,  0.27,  0.36,  0.45,  0.54,  0.63,  0.72,
>         0.81,  0.9 ]))
>  Is that normal? If not, has anyone encountered that before? Ideas welcome!
>  Thanks,
>  Manos._

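histogram with normed=True returns an estimate of the pdf of a
continuous random variable, not the probabilities of a discrete one.
A pdf can take any value greater than or equal to zero; only its
integral has to equal 1. On [0,1] it has to be larger than 1 somewhere,
unless the distribution is uniform, in order to integrate to 1.

In your example each of the 10 bins has width 0.09 and holds 1 of the
10 points, so the density in each bin is (1/10) / 0.09 = 1.11. Multiply
the density by the bin width to get the probability per bin.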
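
To make the difference between density and probability concrete, here is
a minimal sketch based on the example from the question. It assumes a
recent NumPy, where the keyword is density=True (normed is the older
spelling used in the question); the variable names widths and probs are
only for illustration.

```python
import numpy as np

a = np.arange(0, 1, 0.1)

# density=True returns pdf values, i.e. probability per unit length
density, edges = np.histogram(a, bins=10, density=True)

# Bin widths; here every bin is 0.09 wide because the data span [0, 0.9]
widths = np.diff(edges)

# Probability mass per bin = density * bin width
probs = density * widths

print(density)      # values above 1 are fine for a density
print(probs)        # per-bin probabilities
print(probs.sum())  # the probabilities add up to 1 (up to rounding)
```

If per-bin probabilities are what you want, another option is to take the
raw counts from histogram(a) and divide them by their sum.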

This question comes up from time to time; there are more explanations
in the mailing list archives.

Josef
