[SciPy-Dev] Cover trees for nearest neighbors in general metric space

Charles R Harris charlesr.harris@gmail....
Sat Mar 10 17:17:34 CST 2012


On Sat, Mar 10, 2012 at 3:50 PM, Jacob VanderPlas <
vanderplas@astro.washington.edu> wrote:

> Hi,
> I should add to the discussion here the work I've been doing lately on
> Ball Tree.  You can find my progress in github.com/jakevdp/pyDistances.
> The goal of the project is to allow all the metrics from
> scipy.spatial.distances to be used with Ball Tree.
>
> To this end, I've implemented all the metrics in C/cython, and written a
> "DistMetrics" class wrapper that exposes python hooks into these
> functions through a pdist/cdist method.  The speed is comparable to
> scipy.spatial.distance.pdist / cdist for c-ordered arrays, and I'm
> nearly ready to push a big set of commits which will allow these methods
> to support csr-format sparse arrays as well.
>
> Once I have this machinery place, I plan to extend the BallTree class to
> work for both sparse and dense inputs, and support any of the distance
> metrics currently in scipy.spatial.
>
> I mainly had scikit-learn in mind for these enhancements, but it may fit
> in scipy.spatial as well.  Thoughts?
>

I assume all these indexes have different advantages, with no single
algorithm being the 'best' for all applications. In that situation I don't
see any reason not to have three versions of spatial indexing available.
The original kd-tree seems to have been generally useful to many, so this
doesn't look like an esoteric research area and having a common interface
should also make it easy to try the different approaches.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-dev/attachments/20120310/2f1e5f5d/attachment.html 


More information about the SciPy-Dev mailing list