[Numpy-discussion] numpy release

Pauli Virtanen pav@iki...
Thu Apr 24 09:28:39 CDT 2008


Wed, 23 Apr 2008 16:20:41 -0400, David Huard wrote:
> 2008/4/23, Stéfan van der Walt <stefan@sun.ac.za>:
> > Of those tickets, the following are serious:
> >
> > http://projects.scipy.org/scipy/numpy/ticket/605 (a patch is
> > available?, David Huard)
> >   Fixing of histogram.
>
> I haven't found a way to fix histogram reliably without breaking the
> current behavior. There is a patch attached to the ticket, if the
> decision is to break histogram.

Summary of the facts (again...):

  a) histogram's docstring does not match its behavior wrt
     discarding data

  b) given variable-width bins, the results of
     histogram(..., normed=True) are wrong

  c) it might make more sense to handle discarding data in some
     other way than what histogram does now
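
To make point (b) concrete, here is a minimal sketch of why variable-width
bins need a per-bin width when normalizing. It is plain Python so it does not
depend on any particular histogram semantics. The helper names (density,
density_single_width, integral) are hypothetical, the edges are assumed to be
n+1 values delimiting n bins, and the "single width" variant is an
illustrative mistake, not a copy of numpy's code.

```python
def density(counts, edges):
    """Normalize counts so that the histogram integrates to 1."""
    total = sum(counts)
    # One width per bin: this is what variable-width bins require.
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    return [c / (total * w) for c, w in zip(counts, widths)]


def density_single_width(counts, edges):
    """Illustrative mistake: treat every bin as if it had the first bin's width."""
    total = sum(counts)
    w0 = edges[1] - edges[0]
    return [c / (total * w0) for c in counts]


def integral(values, edges):
    """Area under a piecewise-constant histogram."""
    return sum(v * (hi - lo) for v, lo, hi in zip(values, edges[:-1], edges[1:]))


# Assumed example data: three bins, the last one twice as wide as the others.
edges = [0.0, 1.0, 2.0, 4.0]
counts = [1, 2, 2]

good = density(counts, edges)
bad = density_single_width(counts, edges)
```

With these numbers the per-bin normalization integrates to 1, while the
single-width variant overweights the wide bin and integrates to more than 1.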

I think there are now a few choices for what to do about this:

  A) Change the semantics of the histogram function. Old code using
     histogram will simply break, maybe in mysterious ways.

  B) Rename the bins parameter to bin_edges or something else, so that
     any old code using histogram immediately raises an exception that
     is easily understood.

  C) Create a new parameter with more sensible behavior and a name
     different from "bins", and deprecate (at least giving sequences
     to) the "bins" parameter: put up a DeprecationWarning if the user
     does this, but still produce the same results as the old
     histogram. This way the user can forward-port her code at leisure.

  D) Or, retain the old behavior (values below lowest bin ignored) and
     just fix the docstring and the normed=True bug? (I have a patch
     doing this.)
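
For reference, option C could look roughly like the sketch below. Everything
here is hypothetical: the function name histogram_sketch, the new parameter
name bin_edges, and both counting helpers are stand-ins chosen for
illustration, not the patch attached to the ticket. The old semantics are
assumed to be "each entry is a lower edge, values below the lowest bin are
ignored, the last bin is open-ended"; the new semantics are assumed to be
"n+1 edges define n bins, values outside them are dropped".

```python
import warnings
from bisect import bisect_right


def _count_old(data, lower_edges):
    # Assumed old behavior: entries are lower edges, values below the
    # lowest edge are ignored, the last bin has no upper limit.
    counts = [0] * len(lower_edges)
    for x in data:
        i = bisect_right(lower_edges, x) - 1
        if i >= 0:
            counts[i] += 1
    return counts


def _count_new(data, edges):
    # Hypothetical new behavior: n+1 edges give n bins, values outside
    # [edges[0], edges[-1]] are dropped, the last bin includes its right edge.
    counts = [0] * (len(edges) - 1)
    for x in data:
        if edges[0] <= x <= edges[-1]:
            i = min(bisect_right(edges, x) - 1, len(counts) - 1)
            counts[i] += 1
    return counts


def histogram_sketch(data, bins=None, bin_edges=None):
    """Accept the new parameter; keep the old one working with a warning."""
    if bin_edges is not None:
        return _count_new(data, bin_edges)
    if bins is None:
        raise TypeError("pass bin_edges (or the deprecated bins)")
    warnings.warn(
        "passing a sequence as 'bins' is deprecated; use 'bin_edges'",
        DeprecationWarning,
        stacklevel=2,
    )
    return _count_old(data, bins)
```

Old callers keep getting the old numbers plus a warning, and can switch to
the new keyword whenever they are ready.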


So which one (or something else) do we choose for 1.1.0?

-- 
Pauli Virtanen


