[SciPy-User] cluster.hierarchy.fcluster: Choosing value for threshold t?

Pundurs Mark (Nokia-LC/Chicago) mark.pundurs@nokia....
Wed Apr 25 12:39:52 CDT 2012


I have ~8000 observations of 1-D data (not a standard use case for cluster.hierarchy, I suspect). I pass the output of linkage to fcluster. With t=1.5, the output array has only one unique value; with t=0.5, it has as many unique values as the input data has. I can see from the dendrogram that intermediate levels of clustering exist; apart from trial and error, how do I find a t that returns such a clustering?

(Does the vertical axis of the dendrogram have anything to do with fcluster's t argument? In the dendrogram, clusters split at integer values of the vertical component.)
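
For reference, here is a minimal sketch of one way to relate t to the dendrogram. It rests on assumptions not stated above: the data are synthetic stand-ins, the linkage method is "single", and the variable names are made up for illustration. fcluster defaults to criterion='inconsistent', so t is not a dendrogram height unless criterion='distance' is passed; the heights the dendrogram draws are the values in the third column of the linkage matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic 1-D data (an assumption for illustration): three separated groups.
data = np.concatenate([
    np.linspace(0.0, 1.0, 50),
    np.linspace(5.0, 6.0, 50),
    np.linspace(10.0, 11.0, 50),
]).reshape(-1, 1)  # linkage expects shape (n_observations, n_features)

Z = linkage(data, method="single")

# Column 2 of Z holds the merge distances, i.e. the heights that
# dendrogram plots on its vertical axis.
heights = np.sort(Z[:, 2])

# To obtain k clusters, cut between the (k-1)-th and k-th largest merge heights.
k = 3
t = 0.5 * (heights[-k] + heights[-(k - 1)])
labels_by_height = fcluster(Z, t=t, criterion="distance")

# Alternatively, request a maximum number of clusters directly.
labels_by_count = fcluster(Z, t=k, criterion="maxclust")

print(len(np.unique(labels_by_height)), len(np.unique(labels_by_count)))
```

Inspecting the sorted heights (for example, looking for large gaps between consecutive values) is one way to pick a cut without trial and error; whether a gap is meaningful depends on the data and the linkage method.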

Mark Pundurs
Data Analyst - Traffic
Location & Commerce
Chicago

