[Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

Zbyszek Szmek zbyszek@in.waw...
Sat Aug 28 04:12:02 CDT 2010


Hi,

On Fri, Aug 27, 2010 at 06:43:26PM -0600, Charles R Harris wrote:
>    On Fri, Aug 27, 2010 at 2:47 PM, Robert Kern <robert.kern@gmail.com>
>    wrote:
> 
>      On Fri, Aug 27, 2010 at 15:32, David Huard <david.huard@gmail.com>
>      wrote:
>      > Nils and Joseph,
>      > Thanks for the bug report, this is now fixed in SVN (r8672).
> 
>      While we're at it, can we change the name of the argument? "normed"
>      has caused so much confusion over the years. We could deprecate
>      normed=True in favor of pdf=True or density=True.
I think it might be a good moment to also include a different type of normalization:
       n = n / n.sum()
i.e. the relative frequency of counts in each bin (the fraction of all counts that
falls in that bin). This is of course very simple to calculate by hand, but it is also
very common, so I think it would be useful to have this normalization available too.
[http://www.itl.nist.gov/div898/handbook/eda/section3/histogra.htm]
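
To make the difference between the two normalizations concrete, here is a minimal
sketch that computes both by hand from the raw counts. The sample data and bin edges
are made up for illustration; the bins are deliberately unequal, because that is where
the two results differ by more than a constant factor.

```python
import numpy as np

# Hypothetical sample data and unequal bin edges, chosen only for illustration.
a = np.array([0.5, 1.5, 1.7, 2.2, 2.4, 2.9, 3.1, 5.0])
edges = [0.0, 1.0, 3.0, 6.0]

# Raw counts per bin, with no normalization applied by numpy.
counts, edges = np.histogram(a, bins=edges)

# 'frequency': fraction of all samples that falls in each bin; sums to 1.
frequency = counts / counts.sum()

# 'density': frequency divided by the bin width; integrates to 1 over the range.
widths = np.diff(edges)
density = counts / (counts.sum() * widths)

print(frequency)                 # sums to 1
print((density * widths).sum())  # integrates to 1
```

With equal-width bins the two differ only by the constant factor 1/width, which is
probably why they get mixed up so often.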

At the same time, 'normed' could be changed to 'normalized'; I think that this
is the standard spelling.

The new API could be
       histogram(a, bins=10, range=None,
                 normed=False,                 # deprecated
                 normalized=None,              # new
                 weights=None)

and one would pass normalized='density' to get the current behaviour of normed=True,
and normalized='frequency' to get the behaviour described above.
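
As a rough sketch of what I mean, the keyword could behave like the wrapper below.
The function name histogram_normalized is hypothetical and only serves to illustrate
the proposal; it is not an existing numpy function, and the 'density' branch shows
the intended (bin-width aware) result rather than any particular implementation.

```python
import numpy as np

def histogram_normalized(a, bins=10, range=None, normalized=None, weights=None):
    """Hypothetical wrapper illustrating the proposed `normalized` keyword."""
    # Always start from the raw (possibly weighted) counts.
    counts, edges = np.histogram(a, bins=bins, range=range, weights=weights)
    if normalized is None:
        return counts, edges
    total = counts.sum()
    if normalized == 'frequency':
        # Fraction of the total in each bin; the values sum to 1.
        return counts / total, edges
    if normalized == 'density':
        # Divide by the bin widths as well, so the histogram integrates to 1.
        return counts / (total * np.diff(edges)), edges
    raise ValueError("normalized must be None, 'density' or 'frequency'")

# Usage with made-up data:
freq, edges = histogram_normalized([1, 2, 2, 3], bins=3, normalized='frequency')
dens, _ = histogram_normalized([1, 2, 2, 3], bins=3, normalized='density')
```

A string-valued keyword leaves room for other normalizations later without adding
yet another boolean argument.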

Best,
Zbyszek

> 
>      We may even want to consider leaving the normed=True implementation
>      alone with just the deprecation warning. While the behavior is
>      incorrect, it is also very long-standing and something that people
>      might simply have coded around. Changing the behavior without
>      deprecation might break their workarounds silently. I admit it's a bit
>      of a stretch, but conservativeness coupled with the opportunity to
>      make a desirable name change makes this more attractive.
> 
>    I think that's a good approach. One possibility is to have density
>    override normed, something like
> 
>    if density is not None:
>        flowers and unicorns
>    else:
>        same old same old
> 
>    Chuck

