[Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)
Zbyszek Szmek
zbyszek@in.waw...
Sat Aug 28 04:12:02 CDT 2010
Hi,
On Fri, Aug 27, 2010 at 06:43:26PM -0600, Charles R Harris wrote:
> On Fri, Aug 27, 2010 at 2:47 PM, Robert Kern <robert.kern@gmail.com>
> wrote:
>
> On Fri, Aug 27, 2010 at 15:32, David Huard <david.huard@gmail.com>
> wrote:
> > Nils and Joseph,
> > Thanks for the bug report, this is now fixed in SVN (r8672).
>
> While we're at it, can we change the name of the argument? "normed"
> has caused so much confusion over the years. We could deprecate
> normed=True in favor of pdf=True or density=True.
I think this might be a good moment to also add a different type of normalization:
    n = n / n.sum()
i.e. the relative frequency of counts in each bin. It is of course very simple to
calculate by hand, but it is very common, so I think it would be useful to have it
available too. [http://www.itl.nist.gov/div898/handbook/eda/section3/histogra.htm]
At the same time, 'normed' could be changed to 'normalized'; I think that this
is the standard spelling.
The new API could be
    histogram(a, bins=10, range=None,
              normed=False,     # deprecated
              normalized=None,  # new
              weights=None)
and one would pass normalized='density' to get the current behaviour of normed=True,
and normalized='frequency' to get the behaviour described above.
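To make the difference between the two normalizations concrete, here is a minimal sketch. The helper normalize_counts is hypothetical (it is not part of NumPy); it only post-processes the raw counts returned by numpy.histogram, and it assumes 'density' means counts divided by the total count and by the bin width.

```python
import numpy as np


def normalize_counts(counts, bin_edges, normalized=None):
    """Hypothetical helper mirroring the proposed `normalized` argument."""
    counts = np.asarray(counts, dtype=float)
    if normalized is None:
        # No normalization requested: return the raw counts.
        return counts
    if normalized == 'frequency':
        # Fraction of all samples that fall in each bin; sums to 1.
        return counts / counts.sum()
    if normalized == 'density':
        # Probability density: integrates to 1 over the binned range.
        return counts / (counts.sum() * np.diff(bin_edges))
    raise ValueError("normalized must be None, 'frequency' or 'density'")


# Non-uniform bins make the two normalizations visibly different.
data = np.array([0.5, 1.5, 1.5, 2.5, 5.0, 7.0])
counts, edges = np.histogram(data, bins=[0, 1, 2, 4, 8])

frequency = normalize_counts(counts, edges, 'frequency')
density = normalize_counts(counts, edges, 'density')

print(frequency.sum())                    # frequencies add up to one
print((density * np.diff(edges)).sum())   # density integrates to one
```

With uniform bins the two results differ only by a constant factor (the bin width); with non-uniform bins, as here, they differ bin by bin.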
Best,
Zbyszek
>
> We may even want to consider leaving the normed=True implementation
> alone with just the deprecation warning. While the behavior is
> incorrect, it is also very long-standing and something that people
> might simply have coded around. Changing the behavior without
> deprecation might break their workarounds silently. I admit it's a bit
> of a stretch, but conservativeness coupled with the opportunity to
> make a desirable name change makes this more attractive.
>
> I think that's a good approach. One possibility is to have density
> override normed, something like
>
> if density is not None:
>     flowers and unicorns
> else:
>     same old same old
>
> Chuck
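
The "density overrides normed" dispatch suggested above could look roughly like the sketch below. It is only an illustration of the control flow: histogram_sketch and _legacy_normed are hypothetical names, the legacy branch is a placeholder rather than the actual old NumPy code, and the density branch assumes counts divided by total count and bin width.

```python
import warnings

import numpy as np


def _legacy_normed(counts, edges):
    # Placeholder (assumption): the untouched, long-standing normed=True
    # code would live here. This stand-in just returns the counts as-is.
    return counts


def histogram_sketch(a, bins=10, range=None, normed=False,
                     weights=None, density=None):
    """Hypothetical wrapper showing `density` taking precedence over `normed`."""
    counts, edges = np.histogram(a, bins=bins, range=range, weights=weights)

    if density is not None:
        # New code path: `normed` is ignored entirely.
        if density:
            counts = counts / (counts.sum() * np.diff(edges))
        return counts, edges

    if normed:
        # Old code path: behaviour left alone, but the caller is warned.
        warnings.warn("normed is deprecated; use density instead",
                      DeprecationWarning, stacklevel=2)
        counts = _legacy_normed(counts, edges)
    return counts, edges


d, e = histogram_sketch([1, 2, 2, 3, 3, 3], bins=3, density=True)
print((d * np.diff(e)).sum())  # the density integrates to one
```

Using None as the default for density is what lets the function tell "not specified" apart from an explicit density=False, so existing callers of normed=True keep their results and only gain a warning.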