[Numpy-discussion] searchsorted() and memory cache

Charles R Harris charlesr.harris@gmail....
Fri May 9 00:11:48 CDT 2008

On Thu, May 8, 2008 at 11:06 PM, Charles R Harris <charlesr.harris@gmail.com>
wrote:
> On Thu, May 8, 2008 at 10:30 PM, Charles R Harris <
> charlesr.harris@gmail.com> wrote:
>> On Thu, May 8, 2008 at 9:55 PM, Robert Kern <robert.kern@gmail.com>
>> wrote:
>>> On Thu, May 8, 2008 at 10:51 PM, Andrew Straw <strawman@astraw.com>
>>> wrote:
>>> > I've got a large array (25 million int64s) that searchsorted()
>>> > takes a long time to grind through. After a bit of digging in the
>>> > literature and the numpy source code, I believe that searchsorted() is
>>> > implementing a classic binary search,
>>> Yes.
>>> > which is pretty bad in terms of
>>> > cache misses. There are several modern implementations of binary search
>>> > which arrange items in memory such that cache misses are much more
>>> > rare.
>>> > Clearly making such an indexing arrangement would take time, but in my
>>> > particular case, I can spare the time to create an index if searching
>>> > was faster, since I'd make the index once but do the searching many
>>> > times.
>>> >
>>> > Is there an implementation of such an algorithm that works easily with
>>> > numpy? Also, can you offer any advice, suggestions, and comments to me
>>> > if I attempted to implement such an algorithm?
>>> I'm no help. You seem to know more than I do. Sadly, the first few
>>> Google hits I get for "binary search minimize cache misses" are
>>> patents. I don't know what the substantive content of those patents
>>> is; I have a strict policy of not reading patents.
>> I would be interested in adding such a thing if it wasn't patent
>> encumbered. A good start would be a prototype in python to show how it all
>> went together and whether it needed a separate indexing/lookup function or
>> could be fit into the current setup.
> One way I can think of doing this is to have two indices. One is the usual
> sorted list, the second consists of, say, every 1024th entry in the first
> list. Then search the second list first to find the part of the first list
> to search. That won't get you into the very best cache, but it could buy you
> a factor of 2x-4x in speed. It's sort of splitting the binary tree into two
> levels.
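The two-level scheme above might look something like the sketch below (the names and the stride of 1024 are illustrative, not an actual numpy API): a small coarse index holds every 1024th element of the big sorted array, a query first binary-searches that small index (which stays hot in cache), and then searches only the 1024-element chunk it lands in.

```python
import numpy as np

STRIDE = 1024  # chunk size; every STRIDE-th entry goes into the coarse index

def make_coarse_index(sorted_arr):
    # Build once: the small second-level index of every STRIDE-th entry.
    return sorted_arr[::STRIDE]

def two_level_searchsorted(sorted_arr, coarse, value):
    # First search the small coarse index to find the right chunk.
    chunk = int(np.searchsorted(coarse, value, side='right')) - 1
    chunk = max(chunk, 0)
    lo = chunk * STRIDE
    hi = min(lo + STRIDE, len(sorted_arr))
    # Then binary-search only within that one chunk of the big array.
    # Note: for arrays with runs of duplicates spanning a chunk boundary,
    # this may differ from np.searchsorted's side='left' answer.
    return lo + int(np.searchsorted(sorted_arr[lo:hi], value))
```

For unique values this returns the same insertion point as a plain `np.searchsorted(sorted_arr, value)`, but the first-level search touches only `len(sorted_arr) // 1024` elements.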

You could even do it with your data from python, generating the second list
and calling searchsorted multiple times. If you are searching for a bunch of
values, it's probably good to also sort them first so they bunch together in
the same part of the big list.
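The query-sorting idea can be done entirely from Python, along these lines (a sketch, not a numpy-provided function): sort the query values, search with them in ascending order so consecutive lookups stay in the same region of the big array, then undo the permutation.

```python
import numpy as np

def batched_searchsorted(sorted_arr, queries):
    # Sort the queries so successive searches touch nearby parts of
    # sorted_arr, improving cache locality.
    order = np.argsort(queries)
    results = np.empty(len(queries), dtype=np.intp)
    # Search in ascending query order, then scatter the answers back
    # into the original query order.
    results[order] = np.searchsorted(sorted_arr, queries[order])
    return results
```

The results are identical to `np.searchsorted(sorted_arr, queries)`; only the memory access pattern changes.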

