[SciPy-user] histogram bug ?

Alan G Isaac aisaac at american.edu
Fri Dec 30 14:11:03 CST 2005


On Fri, 30 Dec 2005, Gary apparently wrote: 
> In [288]: f 
> Out[288]: 
> array([  46.,   59.,   77.,   87.,   50.,   97.,   84.,   73.,  100., 
>          34.,   86.,   67.,   68.,  100.,   74.,   81.,   94.,   66., 
>          52.,   66.,   69.,   54.,   85.,   97.,   31.,   49.]) 
> In [289]: scipy.histogram(f) 
> --------------------------------------------------------------------------- 
> exceptions.TypeError 


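It is a scoping problem: the list comprehension inside `histogram`
reuses the name `a` as its loop variable, which rebinds `a` from the
array to a number.  (See comments below.)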
This also reminded me of a question:
should a.sort() violate Pythonic expectations by returning a?

Alan Isaac

PS A possible rewrite of `histogram` is offered at the end of this message.
It fixes this problem, avoids using the built-in name `range` as an
argument name, and sets endpoint=False in the linspace call.



################  The Current Def with Problem Highlighted  ###############
def histogram(a, bins=10, range=None, normed=False):
    a = asarray(a).ravel()                                 #<- here `a` is an array
    if not iterable(bins):
        if range is None:
            range = (a.min(), a.max())
        mn, mx = [a+0.0 for a in range]                    #<- now `a` is a number!
        if mn == mx:
            mn -= 0.5
            mx += 0.5
        bins = linspace(mn, mx, bins)

    n = a.sort().searchsorted(bins)                        #<- not caught here because type(a) is int32_arrtype!!
    n = concatenate([n, [len(a)]])
    n = n[1:]-n[:-1]

    if normed:
        db = bins[1] - bins[0]
        return 1.0/(a.size*db) * n, bins
    else:
        return n, bins
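
The rebinding flagged above can be reproduced in isolation. The sketch
below is an illustration only, not part of scipy: the names `leaky` and
`safe` are made up, and an explicit `for` loop stands in for the list
comprehension, because in Python 3 a list comprehension no longer leaks
its loop variable into the enclosing function (in the Python 2 of this
thread it did).

```python
def leaky(values, limits):
    """Reuse the name `a` as the loop variable, as the current def does."""
    a = list(values)              # `a` starts out as the data container
    shifted = []
    for a in limits:              # rebinds `a` to each limit in turn
        shifted.append(a + 0.0)
    return type(a).__name__       # `a` is now the last limit, not the data


def safe(values, limits):
    """Use a distinct loop variable so the data stays bound to `a`."""
    a = list(values)
    shifted = [lim + 0.0 for lim in limits]
    return type(a).__name__


print(leaky([46, 59, 77], (46, 77)))   # int
print(safe([46, 59, 77], (46, 77)))    # list
```

Renaming the loop variable is the smallest change that avoids the
problem, which is what the rewrite does with `mi`.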

################  Proposed Rewrite of histogram  ################
def histogram(a, bins=10, minmax=None, normed=False, copy=True):
    '''Returns `n`,`bins` as arrays, where
    `n` contains the number of items in each bin, and
    `bins` contains the bin cutoffs (cutoff<=value)
    '''
    a = array(a,copy=copy).ravel()
    if not iterable(bins):
        if minmax is None:
            minmax = (a.min(), a.max())
        mn, mx = [mi+0.0 for mi in minmax]
        if mn == mx:
            mn -= 0.5
            mx += 0.5
        bins = linspace(mn, mx, bins, endpoint=False)

    n = a.sort().searchsorted(bins)
    n = concatenate([n, [len(a)]])
    n = n[1:]-n[:-1]

    if normed:
        db = bins[1] - bins[0]
        return 1.0/(a.size*db) * n, bins
    else:
        return n, bins
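
For readers on a present-day NumPy, where `ndarray.sort()` sorts in
place and returns None, the chained `a.sort().searchsorted(bins)` above
will not run as written. A minimal sketch of the same logic under that
assumption follows. The name `histogram_sketch` is hypothetical,
`np.sort` (which returns a sorted copy) replaces the chained call, and
the `copy` argument is dropped because `np.sort` already copies.

```python
import numpy as np


def histogram_sketch(a, bins=10, minmax=None, normed=False):
    """Return `n`, `bins`: counts per bin and the lower bin cutoffs."""
    # np.sort returns a sorted copy, so the caller's data is untouched.
    a = np.sort(np.asarray(a, dtype=float).ravel())
    if not np.iterable(bins):
        if minmax is None:
            minmax = (a.min(), a.max())
        mn, mx = [float(m) for m in minmax]   # loop variable is not `a`
        if mn == mx:
            mn -= 0.5
            mx += 0.5
        # endpoint=False: `bins` holds lower cutoffs only.
        bins = np.linspace(mn, mx, bins, endpoint=False)
    else:
        bins = np.asarray(bins, dtype=float)

    # Position of each cutoff in the sorted data, then difference
    # neighbouring positions to get the count in each bin.
    n = a.searchsorted(bins)
    n = np.concatenate([n, [len(a)]])
    n = n[1:] - n[:-1]

    if normed:
        db = bins[1] - bins[0]
        return n / (a.size * db), bins
    return n, bins


# Gary's data from the top of this message.
f = [46., 59., 77., 87., 50., 97., 84., 73., 100.,
     34., 86., 67., 68., 100., 74., 81., 94., 66.,
     52., 66., 69., 54., 85., 97., 31., 49.]
n, bins = histogram_sketch(f)
print(n.sum())   # 26, every value lands in some bin
```

With the default `bins=10` the last bin has no upper cutoff, so it also
collects the maximum value.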




