[Numpy-discussion] numpy release
Thu Apr 24 09:58:08 CDT 2008
I think a long-term strategy needs to be adopted for histogram.
Right now there is a lot of confusion about what the "bins" keyword
does. It is currently defined as the lower edge of each bin, meaning
that the last bin is open-ended and [-inf,bin0> does not exist. While
this may not be the right thing to fix in 1.1.0, I would really like to
see it fixed somewhere down the line.
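To make the lower-edge convention concrete, here is a minimal sketch that emulates it. The helper name lower_edge_histogram is hypothetical and this is not NumPy's implementation; it only illustrates the semantics described above, where values below the first edge are dropped and the last bin has no upper limit.

```python
import numpy as np

def lower_edge_histogram(a, bins):
    """Count values, treating each entry of `bins` as a lower bin edge.

    Hypothetical helper for illustration only (not NumPy's code).
    """
    a = np.asarray(a, dtype=float)
    bins = np.asarray(bins, dtype=float)
    # Index of the last lower edge that is <= each value.
    idx = np.searchsorted(bins, a, side="right") - 1
    # Values below bins[0] get index -1 and are silently discarded.
    idx = idx[idx >= 0]
    # One bin per lower edge; the last bin extends to +inf.
    return np.bincount(idx, minlength=len(bins))

data = [-5.0, 0.5, 1.5, 2.5, 100.0]
counts = lower_edge_histogram(data, [0.0, 1.0, 2.0])
print(counts)  # [1 1 2]: -5.0 is dropped, 100.0 lands in the last bin
```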
On Apr 24, 2008, at 10:28 AM, Pauli Virtanen wrote:
> Wed, 23 Apr 2008 16:20:41 -0400, David Huard wrote:
>> I haven't found a way to fix histogram reliably without breaking the
>> current behavior. There is a patch attached to the ticket, if the
>> decision is to break histogram.
> Summary of the facts (again...):
> a) histogram's docstring does not match its behavior wrt
> discarding data
This is an easy fix and should definitely go into 1.1.0 :)
> b) given variable-width bins, the results of
> histogram(..., normed=True) are wrong
Also a quick fix that should be part of 1.1.0
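The thread does not spell out what goes wrong, so the following is only a sketch of what a width-aware normalization usually looks like: each count is divided by the total count times its own bin width, so the result integrates to 1. The function name normalize_counts and the explicit edges array (including the right edge of the last bin) are assumptions made for the example.

```python
import numpy as np

def normalize_counts(counts, edges):
    """Turn raw counts into a density that integrates to 1.

    Hypothetical helper; assumes `edges` has len(counts) + 1 entries.
    """
    counts = np.asarray(counts, dtype=float)
    widths = np.diff(np.asarray(edges, dtype=float))
    # Dividing by the per-bin width is what matters for unequal bins.
    return counts / (counts.sum() * widths)

edges = [0.0, 1.0, 3.0, 7.0]   # bin widths 1, 2 and 4
counts = [2, 2, 4]
density = normalize_counts(counts, edges)
area = float(np.sum(density * np.diff(edges)))
print(density, area)
```

Note that this needs a width for every bin, which an open-ended last bin cannot provide.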
> c) it might make more sense to handle discarding data in some
> other way than what histogram does now
I would like to see this, but it does not have to happen in 1.1.0 :)
> I think there are now a couple of choices what to do with this:
> A) Change the semantics of the histogram function. Old code using it
> will simply break, maybe in mysterious ways
Not really a satisfactory approach, although personally I don't mind,
even though it would break some code of mine. I would rather see a
better function and have to make code changes than keep the current
confusion. Other people will likely
> B) Rename the bins parameter to bin_edges or something else, so that
> any old code using histogram immediately raises an exception that is
> easily understood.
With this approach, bin_edges would contain one more value than bins,
since the right edge of the last bin has to be defined.
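As a rough illustration of the edge-based convention, the sketch below counts into len(bin_edges) - 1 bins and drops anything outside the outer edges. The helper histogram_from_edges is hypothetical, and treating every bin as half-open [left, right) is a choice made for this example, not something decided in the thread.

```python
import numpy as np

def histogram_from_edges(a, bin_edges):
    """Count values into the bins delimited by `bin_edges`.

    Hypothetical helper: N + 1 edges define N half-open bins [left, right).
    """
    a = np.asarray(a, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    n_bins = len(bin_edges) - 1
    idx = np.searchsorted(bin_edges, a, side="right") - 1
    # Keep only values that fall inside [bin_edges[0], bin_edges[-1]).
    inside = (idx >= 0) & (idx < n_bins)
    return np.bincount(idx[inside], minlength=n_bins)

bin_edges = [0.0, 1.0, 2.0, 3.0]
counts = histogram_from_edges([-5.0, 0.5, 1.5, 2.5, 100.0], bin_edges)
print(len(bin_edges), len(counts))  # 4 edges, 3 bins
print(counts)                       # [1 1 1]: both outliers are dropped
```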
> C) Create a new parameter with more sensible behavior and a name
> different from "bins", and deprecate (at least giving sequences to)
> "bins" parameter: put up a DeprecationWarning if the user does this,
> still produce the same results as the old histogram. This way the user
> can forward-port her code at leisure.
I think this is probably the best approach to accommodate everyone.
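A minimal sketch of how option C could look, assuming a new keyword called bin_edges. The function name histogram_compat, the keyword name and the warning text are all hypothetical, and only the sequence form of "bins" is covered: old calls keep their lower-edge results but emit a DeprecationWarning, while new calls use explicit edges.

```python
import warnings
import numpy as np

def _count(a, edges, n_bins):
    # Shared counting step: index of the last edge <= each value.
    idx = np.searchsorted(edges, a, side="right") - 1
    keep = (idx >= 0) & (idx < n_bins)
    return np.bincount(idx[keep], minlength=n_bins)

def histogram_compat(a, bins=None, bin_edges=None):
    """Hypothetical wrapper: new `bin_edges`, deprecated sequence `bins`."""
    a = np.asarray(a, dtype=float)
    if bin_edges is not None:
        # New semantics: N + 1 edges define N bins.
        edges = np.asarray(bin_edges, dtype=float)
        return _count(a, edges, len(edges) - 1)
    if bins is None or np.ndim(bins) == 0:
        raise TypeError("this sketch needs bin_edges or a sequence of bins")
    warnings.warn(
        "passing a sequence as 'bins' is deprecated; use 'bin_edges'",
        DeprecationWarning,
        stacklevel=2,
    )
    # Old semantics: entries are lower edges and the last bin is open-ended.
    edges = np.asarray(bins, dtype=float)
    return _count(a, edges, len(edges))

data = [-5.0, 0.5, 1.5, 2.5, 100.0]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = histogram_compat(data, bins=[0.0, 1.0, 2.0])
new = histogram_compat(data, bin_edges=[0.0, 1.0, 2.0, 3.0])
print(old, new, [w.category.__name__ for w in caught])
```

The old call still returns the same counts as before, so existing code keeps working while the warning points users at the new keyword.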
> So which one (or something else) do we choose for 1.1.0?
> Pauli Virtanen