[SciPy-Dev] Cover trees for nearest neighbors in general metric space
Charles R Harris
Sat Mar 10 17:17:34 CST 2012
On Sat, Mar 10, 2012 at 3:50 PM, Jacob VanderPlas <
> I should add to the discussion here the work I've been doing lately on
> Ball Tree. You can find my progress in github.com/jakevdp/pyDistances.
> The goal of the project is to allow all the metrics from
> scipy.spatial.distances to be used with Ball Tree.
> To this end, I've implemented all the metrics in C/cython, and written a
> "DistMetrics" class wrapper that exposes python hooks into these
> functions through a pdist/cdist method. The speed is comparable to
> scipy.spatial.distance.pdist / cdist for c-ordered arrays, and I'm
> nearly ready to push a big set of commits which will allow these methods
> to support csr-format sparse arrays as well.
> Once I have this machinery place, I plan to extend the BallTree class to
> work for both sparse and dense inputs, and support any of the distance
> metrics currently in scipy.spatial.
> I mainly had scikit-learn in mind for these enhancements, but it may fit
> in scipy.spatial as well. Thoughts?
I assume all these indexes have different advantages, with no single
algorithm being the 'best' for all applications. In that situation I don't
see any reason not to have three versions of spatial indexing available.
The original kd-tree seems to have been generally useful to many, so this
doesn't look like an esoteric research area and having a common interface
should also make it easy to try the different approaches.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the SciPy-Dev