[SciPy-Dev] Faster implementation of cluster.hierarchy
Wed Oct 12 06:12:18 CDT 2011
A mathematician at Stanford named Daniel Müllner recently came up with a
package that implements the hierarchical clustering methods found in
scipy.cluster.hierarchy. His implementation is in C++, but includes a
Python API that uses the same interface as scipy.cluster.hierarchy.
Müllner has posted benchmarks as well as algorithmic explanations of why his
implementation is faster in a paper on arXiv<http://arxiv.org/abs/1109.2378>.
He also has a webpage that describes the package
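
For readers unfamiliar with what these routines compute, here is a minimal
pure-Python sketch of naive single-linkage clustering. It is neither
Müllner's algorithm nor SciPy's implementation, and the function name
naive_single_linkage is made up for illustration. It only shows the kind of
roughly cubic-time baseline that faster algorithms improve on, and it returns
merge steps in the same row layout as the linkage matrix produced by
scipy.cluster.hierarchy.linkage (two cluster ids, merge distance, new
cluster size).

```python
from itertools import combinations
from math import dist  # Python 3.8+


def naive_single_linkage(points):
    """Naive single-linkage agglomerative clustering (illustrative only).

    Returns n - 1 merge steps as (id_a, id_b, distance, size) tuples,
    mirroring the row layout of a SciPy linkage matrix.
    """
    n = len(points)
    # Active clusters: cluster id -> list of original point indices.
    clusters = {i: [i] for i in range(n)}
    merges = []
    next_id = n  # newly formed clusters are numbered n, n + 1, ...
    while len(clusters) > 1:
        best = None
        # Scan every pair of active clusters for the closest pair.
        for a, b in combinations(sorted(clusters), 2):
            # Single linkage: distance between the two nearest members.
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Merge the two closest clusters under a fresh id.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d, len(clusters[next_id])))
        next_id += 1
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)]
merges = naive_single_linkage(points)
for row in merges:
    print(row)
```

Every iteration rescans all remaining cluster pairs, which is the cost a
better algorithm avoids.
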
Because the results of the benchmarks look good, I am interested in getting
the scikit-learn package to use this implementation for the hierarchical
clustering provided by that package. Rather than integrate the code in
scikit-learn, it seems more appropriate to integrate it upstream in
scipy.cluster.hierarchy. Is there anyone who is interested in this
integration? I am inexperienced with integrating C++ code and Python code,
and also with how things work in the scipy project, so I'm not sure how to
Note: Although Müllner's code is currently under a GPL license, he has
stated to me in e-mail that he would be willing to put it under the BSD-2
license if somebody put in the time to integrate it into scipy.