[Numpy-discussion] 3-10x speedup in bincount()
Wed Mar 14 08:35:55 CDT 2007
I'd de glad to test your bincount function for histogramming purposes.
2007/3/14, Stephen Simmons <firstname.lastname@example.org>:
> Well, there were no responses to my earlier email proposing changes to
> numpy.bincount() to make it faster and more flexible. Does this mean
> noone is using bincount? :-)
> Anyway I've now made the proposed modifications and got substantial
> speedups of 3-10. Details are in this extract from my C code. For the
> time being I will test this as a standalone extension module. When
> stable and tests are written, I'll submit as a patch to the numpy's
> * Faster versions of bincount and casting strings to integers
> * Author: Stephen Simmons, email@example.com
> * Date: 11 March 2007
> * This module contains C code for functions I am using to accelerate
> * SQL-like aggregate functions for a column-oriented database based on
> * subtotal's bincount is typically 3-10 times faster than numpy's
> * and more flexible
> * - takes bins as they are, without promoting their type to int32
> * - takes weights as they are, without promoting their type to double
> * - ignores negative bins, rather than reporting an error
> * - allows optional int32/double output array to be passed in
> * - specify maximum bin number to use. Larger bins numbers are ignored
> * - only scans for max bin number if neither max_bin nor out array
> * atoi is 30-60 times faster than casting a string array to an integer
> * and may optionally use a translation table. The translation table is
> * a convenient way to map sparse integer strings to adjacent bins before
> * using bincount.
> * Typical usage
> * -------------
> # Set up arrays 5,000,000 elts long. s1 has strings filled with '1' and
> # w are weights
> >>> s1 = numpy.ndarray(shape=5000000, dtype='S1'); s1[:]='2'; s1[1::2]='1'
> >>> w = numpy.arange(5000000,dtype='f4')
> # Using standard numpy functions, string conversion is slow
> >>> i = s1.astype('i1') -> 5.95 sec (!)
> >>> numpy.bincount(i) -> 113 ms
> >>> numpy.bincount(i, w) -> 272 ms
> >>> numpy.bincount(s1.astype('i1'), w) -> 6.12 sec
> # Using the faster functions here:
> >>> i = subtotal.atoi(s1) -> 90 ms (60x faster)
> >>> subtotal.bincount(i, max_bin=2) -> 31 ms (3.6x faster)
> >>> subtotal.bincount(i, w, max_bin=2) -> 51 ms (5.3x faster)
> # In both bases, output is
> array([2, 1, 2, ..., 1, 2, 1], dtype=int8)
> array([ 0, 2500000, 2500000])
> array([ 0.00000000e+00, 6.25000000e+12, 6.24999750e+12])
> # As an example of using a translate table, run bincount from atoi
> # a translate table for counting odd vs even strings. This does the
> whole lot
> # in 158ms, a speedup of 38x from 6.12 s above.
> >>> t = numpy.arange(256, dtype='i1') % 2
> >>> subtotal.bincount(subtotal.atoi(s1, t), w, max_bin=t.max()) ->
> 158 ms
> array([ 6.24999750e+12, 6.25000000e+12])
> * These timings are based on numpy-1.0.1 Windows binary distribution
> * for Python 2.5, compared with building this code using standard
> * without any particular compiler optimisations:
> C:\mnt\dev\subtotal> python setup.py build -cmingw32
> * Processor is a 1.83GHz Pentium-M
> << rest of C code omitted >>
> Stephen Simmons wrote:
> > Hi,
> > I'd like to propose some minor modifications to the function
> > bincount(arr, weights=None), so would like some feedback from other
> > uses of bincount() before I write this up as a proper patch, .
> > Background:
> > bincount() has two forms:
> > - bincount(x) returns an integer array ians of length max(x)+1 where
> > ians[n] is the number of times n appears in x.
> > - bincount(x, weights) returns a double array dans of length max(x)+1
> > where dans[n] is the sum of elements in the weight vector weights[i]
> > at the positions where x[i]==n
> > In both cases, all elements of x must be non-negative.
> > Proposed changes:
> > (1) Remove the restriction that elements of x must be non-negative.
> > Currently bincount() starts by finding max(x) and min(x). If the min
> > value is negative, an exception is raised. This change proposes
> > dropping the initial search for min(x), and instead testing for
> > non-negativity while summing values in the return arrays ians or dans.
> > Any indexes where where x is negative will be silently ignored. This
> > will allow selective bincounts where values to ignore are flagged with
> > a negative bin number.
> > (2) Allow an optional argument for maximum bin number.
> > Currently bincount(x) returns an array whose length is dependent on
> > max(x). It is sometimes preferable to specify the exact size of the
> > returned array, so this change would add a new optional argument,
> > max_bin, which is one less than the size of the returned array. Under
> > this change, bincount() starts by finding max(x) only if max_bin is
> > not specified. Then the returned array ians or dans is created with
> > length max_bin+1, and any indexes that would overflow the output array
> > are silently ignored.
> > (3) Allow an optional output array, y.
> > Currently bincount() creates a new output array each time. Sometimes
> > it is preferable to add results to an existing output array, for
> > example, when the input array is only available in smaller chunks, or
> > for a progressive update strategy to avoid fp precision problems when
> > adding lots of small weights to larger subtotals. Thus we can add an
> > extra optional argument y that bypasses the creation of an output array.
> > With these three change, the function signature of bincount() would
> > become:
> > bincount(x, weights=None, y=None, max_bin=None)
> > Anyway, that's the general idea. I'd be grateful for any feedback
> > before I code this up as a patch to _compiled_base.c.
> > Cheers
> > Stephen
> Numpy-discussion mailing list
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Numpy-discussion