[SciPy-User] KDTree count_neighbors

Anne Archibald aarchiba@physics.mcgill...
Fri Jul 2 19:07:06 CDT 2010

On 2 July 2010 16:35, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Fri, Jul 2, 2010 at 1:25 PM, Anne Archibald
> <aarchiba@physics.mcgill.ca> wrote:
>> The count_neighbors function is designed to compare close neighbors
>> between two trees, not between one tree and one point or one tree and
>> one array of points. I think the fastest way to get what you want is
>> to just query for the neighbors and take the length of the resulting
>> list. You could also try putting your query point (or points) into a
>> KDtreel; if you have many query points this will be much faster.
> Oh, that sounds interesting. So I build one tree for the data and
> another tree for the query points? (For each query point I want to
> find the k nearest neighbors in the data tree). What's the next step?

Looking at the code, I see that much of this is not implemented, unfortunately.

The compiled version (much faster and more space-efficient) supports
only querying of arrays of points.

The python version (more flexible) has several additional algorithms
implemented; in particular it has query_ball_point, which finds all
neighbours within a radius no matter how many there are, and
query_ball_tree, which does the same thing but accepts a tree as the
second argument and takes advantage of it to accelerate the algorithm
(from something like m log(n) to something like log(m)log(n)). Because
the python tree supports annotations more naturally, I also
implemented a two-tree neighbour-counting algorithm.

What isn't there is what I think you were hoping for: an algorithm
that takes two trees and finds the k nearest neighbours in one of each
point in the other. The reason it's not there is not just because I
didn't bother to implement it; it's there because you don't gain the
same benefits from such an algorithm - since each point always has k
neighbours, you're stuck with at least m log(n) behaviour. There is
some tree traversal that can sometimes be avoided, since points near
to each other may have similar sets of neighbours (or not), but it
would take quite a sophisticated and clever algorithm to take full
advantage of it.


More information about the SciPy-User mailing list