[Numpy-tickets] [NumPy] #189: Histograms (1d, 2d, nd)

NumPy numpy-tickets at scipy.net
Mon Oct 23 09:28:34 CDT 2006


#189: Histograms (1d, 2d, nd)
-------------------------+--------------------------------------------------
 Reporter:  dhuard       |        Owner:  oliphant   
     Type:  enhancement  |       Status:  assigned   
 Priority:  normal       |    Milestone:  1.0 Release
Component:  Other        |      Version:  devel      
 Severity:  normal       |   Resolution:             
 Keywords:               |  
-------------------------+--------------------------------------------------
Comment (by dhuard):

 Replying to [comment:5 oliphant]:
 > histogram1d has an axis keyword but only works for 2 dimensions.  It
 should work for an N-dimensional array.

 Done

 > Moving histogram to a compatibility module is more problematic.
 Histogram has always placed out-of range values in the upper and lower
 bins.  Changing this behavior will create issues for some people.

 Currently, only upper out-of-range values are stored; lower outliers are
 not counted at all. Somebody on the list suggested returning a dictionary
 with the upper and lower outliers, and I have implemented that. I tried to
 minimize code breakage by keeping two return values (hist, dict) and
 keeping the order of the calling arguments, so that someone calling
 {{{histogram(arr, 20)[0]}}} won't see any difference.

 However, here is what will break:

 1. '''Second return value'''[[BR]]
 Instead of returning (hist_array, left_edges), histogram now returns
 (hist_array, dict).
 The dict contains {'edges': the bin edges (N+1), 'upper': upper outliers,
 'lower': lower outliers, 'bincenters': the bin centers (N)}.

 2. '''Explicit range'''[[BR]]
 Outliers are not included in the histogram array, but stored in the dict.
 This matters only if range or bins is given explicitly: if neither is
 given, the range is (min, max), so there are no outliers.
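
 The two changes above can be illustrated with a minimal sketch. This is
 not the submitted patch: the function name histogram_with_outliers is
 hypothetical, it assumes 'upper' and 'lower' hold outlier counts, and it
 leans on the current np.histogram, which ignores values outside the bin
 edges.

```python
import numpy as np

def histogram_with_outliers(a, bins=10, range=None):
    """Hypothetical sketch: return (hist, info), with outliers kept out of hist."""
    a = np.asarray(a).ravel()
    if range is None:
        # Default range is (min, max), so no value can be an outlier.
        lo, hi = a.min(), a.max()
    else:
        lo, hi = range
    edges = np.linspace(lo, hi, bins + 1)
    # Values outside [lo, hi] are ignored here; the last bin includes hi.
    hist, _ = np.histogram(a, bins=edges)
    info = {
        'edges': edges,                                # N+1 bin edges
        'bincenters': 0.5 * (edges[:-1] + edges[1:]),  # N bin centers
        'lower': int(np.sum(a < lo)),                  # assumed: count below range
        'upper': int(np.sum(a > hi)),                  # assumed: count above range
    }
    return hist, info

data = np.array([-1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 5.0])
hist, info = histogram_with_outliers(data, bins=2, range=(0.0, 2.0))
print(hist, info['lower'], info['upper'])
```

 With the explicit range (0.0, 2.0), the values -1.0 and 5.0 are reported
 in the dict rather than added to the first and last bins.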

 Here are the additions:

 1. Support for weighted samples.

 2. Axis argument to compute 1D histogram along a given axis.
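
 Both additions can be approximated with the current NumPy API, as a rough
 sketch. The calling convention of the patch itself is not shown in this
 comment, so the example below uses np.histogram's weights argument and
 np.apply_along_axis as stand-ins; hist_1d is a hypothetical helper.

```python
import numpy as np

# Weighted samples: each sample adds its weight, not 1, to its bin.
samples = np.array([0.1, 0.4, 0.6, 0.9])
weights = np.array([1.0, 2.0, 3.0, 4.0])
weighted, edges = np.histogram(samples, bins=2, range=(0.0, 1.0),
                               weights=weights)

# 1D histograms along a given axis: one histogram per row (axis=1).
arr = np.array([[0.1, 0.2, 0.9],
                [0.6, 0.7, 0.8]])

def hist_1d(row):
    # Hypothetical helper: fixed bins so every row yields the same shape.
    return np.histogram(row, bins=2, range=(0.0, 1.0))[0]

per_row = np.apply_along_axis(hist_1d, 1, arr)
print(weighted, per_row.shape)
```

 Fixing the range for every row keeps the bin edges identical across the
 axis, so the per-row results stack into one array.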

 Concerns were expressed on the list that statistical functions should be
 put in scipy, or in a numpy statistical module (Tim Hochberg). Which do
 you prefer? I'll submit a patch accordingly.

-- 
Ticket URL: <http://projects.scipy.org/scipy/numpy/ticket/189#comment:6>
NumPy <http://projects.scipy.org/scipy/numpy>
The fundamental package needed for scientific computing with Python.

