[SciPy-user] Problem with numpy.histogram

LB berthe.loic@gmail....
Thu Sep 27 05:09:17 CDT 2007


   Hi,

I'm getting strange results with numpy.histogram.

Here is its docstring:
"""
Help on function histogram in module numpy.lib.function_base:

histogram(a, bins=10, range=None, normed=False)
    Compute the histogram from a set of data.

    :Parameters:
      - `a` : array
        The data to histogram. n-D arrays will be flattened.
      - `bins` : int or sequence of floats, optional
        If an int, then the number of equal-width bins in the given range.
        Otherwise, a sequence of the lower bound of each bin.
      - `range` : (float, float), optional
        The lower and upper range of the bins. If not provided, then
        (a.min(), a.max()) is used. Values outside of this range are
        allocated to the closest bin.
      - `normed` : bool, optional
        If False, the result array will contain the number of samples in
        each bin.
        If True, the result array is the value of the probability *density*
        function at the bin normalized such that the *integral* over the
        range is 1. Note that the sum of all of the histogram values will
        not usually be 1; it is not a probability *mass* function.

    :Returns:
      - `hist` : array (n,)
        The values of the histogram. See `normed` for a description of the
        possible semantics.
      - `lower_edges` : float array (n,)
        The lower edges of each bin.
"""
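
As a side note on the `normed` description, here is a minimal sketch of what "the integral over the range is 1" means in practice. It assumes a recent NumPy, where the keyword is spelled `density` and the second return value holds all n+1 bin edges; neither matches the 1.0.2 signature quoted above. The seed and the bin count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
r = rng.normal(8, 2, 500)

# Assumption: `density=True` plays the role of `normed=True` in the
# docstring above (recent NumPy; 1.0.2 only has `normed`).
dens, edges = np.histogram(r, bins=16, density=True)
widths = np.diff(edges)          # width of each bin

integral = float((dens * widths).sum())  # area under the histogram
total = float(dens.sum())                # plain sum of the bin values

print("integral over the range:", integral)
print("sum of the bin values:  ", total)
```

The area under the histogram comes out as 1 up to rounding, while the plain sum of the bin values depends on the bin width and is generally not 1.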

And here is a snippet of code:
>>> r = random.normal(8, 2, 500)
>>> r.min(), r.max()
(1.164117097856284, 13.069426390055149)
>>> ra
(3, 12)
>>> pdf, xpdf = histogram(r, nbins, range=ra, normed=False)
>>> pdf
array([ 1,  6,  5,  8, 30, 39, 53, 55, 61, 50, 45, 42, 32, 26, 17, 27])
>>> pdf.sum()
497

It seems I've lost 3 of my 500 random numbers!

>>> r[ r>= ra[1]]
array([ 12.00676288,  12.8381615 ,  12.48380931,  12.55392835,
        12.26153469,  12.92869504,  12.58290343,  12.03782311,
        13.06942639,  12.06375346,  12.02970414,  12.53556779,
        12.54203654,  12.02611864,  12.85113934,  12.64692817])

>>> r[ r<= ra[0]]
array([ 1.1641171 ,  2.85873306,  2.92046745])

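So the number of missing samples (3) matches the number of samples
below the lower bound of the range given to histogram. Since the total
is 497, the 16 samples above the upper bound must have been counted.
According to the docstring, values outside the range should be
allocated to the closest bin, so this smells like a bug to me.
Have I misunderstood how numpy.histogram is meant to be used?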
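
A minimal sketch of the bookkeeping for this kind of check, assuming a recent NumPy rather than 1.0.2. On recent releases, as far as I know, samples outside `range` are ignored on both sides, so the total would differ from the 497 shown above. The seed, and `nbins = 16` (inferred from the 16 counts in the output above), are assumptions; `np.clip` is used here as one possible way to get the "closest bin" behaviour the docstring describes, not as what numpy.histogram does internally.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
r = rng.normal(8, 2, 500)
lo, hi = 3, 12                   # same range as `ra` above
nbins = 16                       # assumption: matches the 16 counts shown above

# Count the samples on each side of the range.
below = int((r < lo).sum())
above = int((r > hi).sum())
inside = r.size - below - above

counts, edges = np.histogram(r, bins=nbins, range=(lo, hi))
print("below / inside / above:", below, inside, above)
print("histogram total:       ", counts.sum())

# Clipping first moves every out-of-range sample onto the nearest edge,
# so it lands in the first or last bin ("allocated to the closest bin").
clipped_counts, _ = np.histogram(np.clip(r, lo, hi), bins=nbins, range=(lo, hi))
print("total after clipping:  ", clipped_counts.sum())
```

Comparing `counts.sum()` against `below`, `inside` and `above` shows directly which samples were dropped, and the clipped version accounts for all 500.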

For information:
>>> numpy.__version__
'1.0.2'

Regards,

--
LB


