[Numpy-discussion] How to efficiently reduce a 1-dim 100-10000 element array with user defined binary function

Charles R Harris charlesr.harris@gmail....
Fri Nov 14 21:00:49 CST 2008


On Fri, Nov 14, 2008 at 6:22 PM, Kim Hansen <slaunger@gmail.com> wrote:

> Dear numpy-discussion,
>
> I am quite new to Python and numpy.
>
> I am working on a Python application using Scipy, where I need to
> unpack and pack quite large amounts of data (GBs) into data structures
> and convert them into other data structures. Until now the program has
> been running amazingly efficiently (processing 10-125 MB/s depending
> on the task) because I have managed to use only built-in array
> functions for all computationally intensive operations (and beause i
> have fairly good hardware to run it on).
>
> However, I have run into problems now, as I need to compute new
> checksums for modified data structures using a ones complement add on
> one-dimensional, uint16 arrayrs with 50-10000 elements.
>
> The best procedure I have come up with yet is the following implementation
>
> def complement_add_uint16(a, b):
>    """
>    Returns the complement ones addition of two unsigned 16 bit integers
>
>    The values are added and if the carry is set, the value is incremented.
>
>    It is assumed that a and b are both in the range [0; 0xFFFF].
>    """
>    c = a + b
>    c += c >> 16
>    return c & 0xFFFF
>
> def complement_add_checksum(ints):
>    """
>    Return a complement checksum based on a
>    one-dimensional array of dtype=uint16
>
>    """
>    return reduce(complement_add_uint16, ints, 0)
>

Can't you add them up in a higher precision, then do a = (a & 0xffff ) + (a
>> 16) until the high order bits are gone? The high order bits will count
the number ones you need to add. Something like

def complement_add_uint16(x) :
    y = sum(x, dtype=uint64)
    while y >= 2**16 :
        y = (y & uint64(0xffff)) + (y >> uint64(16))
    return y

You might need to block the data to avoid overflow of uint64, but you get an
extra 48 bits in any case. Mind, I'm not sure this is exactly the same thing
as what you are trying to do, so it bears checking.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20081114/4ac933db/attachment.html 


More information about the Numpy-discussion mailing list