[Numpy-discussion] numpy histogram normed=True (bug / confusing behavior)

Sebastian Haase seb.haase@gmail....
Sun Aug 29 16:06:54 CDT 2010


On Sun, Aug 29, 2010 at 3:21 PM, Nils Becker <n.becker@amolf.nl> wrote:
>> On Sat, Aug 28, 2010 at 04:12, Zbyszek Szmek <zbyszek@in.waw.pl> wrote:
>>> Hi,
>>>
>>> On Fri, Aug 27, 2010 at 06:43:26PM -0600, Charles R Harris wrote:
>>>>    On Fri, Aug 27, 2010 at 2:47 PM, Robert Kern <robert.kern@gmail.com>
>>>>    wrote:
>>>>
>>>>      On Fri, Aug 27, 2010 at 15:32, David Huard <david.huard@gmail.com>
>>>>      wrote:
>>>>      > Nils and Joseph,
>>>>      > Thanks for the bug report, this is now fixed in SVN (r8672).
>>>>
>>>>      While we're at it, can we change the name of the argument? "normed"
>>>>      has caused so much confusion over the years. We could deprecate
>>>>      normed=True in favor of pdf=True or density=True.
>>> I think it might be a good moment to also include a different type of normalization:
>>>       n = n / n.sum()
>>> i.e. the frequency of counts in each bin. This one is of course very simple to calculate
>>> by hand, but very common. I think it would be useful to have this normalization
>>> available too. [http://www.itl.nist.gov/div898/handbook/eda/section3/histogra.htm]
>>
>> My feeling is that this is trivial to do "by hand". I do not see a
>> reason to add an option to histogram() to do this.
>>
> Hi,
>
> +1 for not silently changing the behavior of normed=True. (I'm one of
> the people who have worked around it).
>
> One argument in favor of putting both normalizing styles 'frequency' and
> 'density' may be that the documentation will automatically become very
> clear. A user sees all options and there is little chance of a
> misunderstanding. Of course, a sentence like "If you want frequency
> normalization, use histogram(data, normalized=False)/sum(data)" would
> also make things clear, without adding the frequency option.
>
I am in favor of adding an option for the density mode (not for this
release, I guess).
I often have a long expression in place of `data`, and the one extra
keyword saves lots of typing.
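
To make the two normalizations concrete, here is a minimal sketch. It assumes
a NumPy version in which histogram() accepts the `density=True` keyword
proposed in this thread; the sample data and the bin edges are made up purely
for illustration.

```python
import numpy as np

# Hypothetical sample data and deliberately unequal bin widths.
rng = np.random.default_rng(0)
data = rng.normal(size=1000)
bins = [-4.0, -1.0, 0.0, 0.5, 4.0]

# Raw counts per bin (no normalization).
counts, edges = np.histogram(data, bins=bins)
widths = np.diff(edges)

# "Frequency" normalization: the fraction of counted samples in each bin.
# The values sum to 1 regardless of the bin widths.
frequency = counts / counts.sum()

# "Density" normalization: counts divided by (total count * bin width),
# so that the histogram integrates to 1 over the binned range.
density, _ = np.histogram(data, bins=bins, density=True)

print("frequency sum:", frequency.sum())
print("density integral:", (density * widths).sum())
```

Note that the frequency is obtained by dividing the counts by their own sum,
not by the sum of the data. With equal-width bins the two normalizations
differ only by a constant factor; with unequal bins they do not.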

-Sebastian Haase

