[Numpy-discussion] searchsorted() and memory cache
Fri May 9 08:36:31 CDT 2008
I don't know if it helps, but Ulrich Drepper had a 9 part series about
memory in Linux Weekly News (http://lwn.net). You can under all 9 linked
under his name in the Guest archives
(http://lwn.net/Archives/GuestIndex/) as not all are linked together.
The first one is:
What every programmer should know about memory, Part 1
<http://lwn.net/Articles/250967/> (September 21, 2007)
In part 5 there was a comment on 'Cache-oblivious algorithms' by *akapoor*:
"i guess it's worth mentioning harald-prokop's 1999 thesis on "cache oblivious algorithms"
Francesc Alted wrote:
> A Friday 09 May 2008, Andrew Straw escrigué:
>> I've got a big element array (25 million int64s) that searchsorted()
>> takes a long time to grind through. After a bit of digging in the
>> literature and the numpy source code, I believe that searchsorted()
>> is implementing a classic binary search, which is pretty bad in terms
>> of cache misses. There are several modern implementations of binary
>> search which arrange items in memory such that cache misses are much
>> more rare. Clearly making such an indexing arrangement would take
>> time, but in my particular case, I can spare the time to create an
>> index if searching was faster, since I'd make the index once but do
>> the searching many times.
>> Is there an implementation of such an algorithm that works easilty
>> with numpy? Also, can you offer any advice, suggestions, and comments
>> to me if I attempted to implement such an algorithm?
> Well, if you can afford extra space for the hashes you can always use a
> dictionary for doing the lookups. In pure Python they are around 3x
> faster (for arrays of 8 millions of elements) than binary searches. If
> your space is tight, you can build an extension (for example in Pyrex)
> for doing binary search for your specific type (int64), for an small
> improvement. Finally, if you combine this approach with what is
> suggesting Charles Harris (i.e. creating several levels of caches, but
> not more than two, which in my experience works best), you can have
> pretty optimal lookups with relatively low space overhead.
> See this thread for a discussion of the binary/hash lookup approaches:
> Hope that helps,
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Numpy-discussion