Tue Aug 16 00:08:54 CDT 2011
#1484: "Kulsinski" dissimilarity seems wrong
Reporter: muellner
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: scipy.spatial | Version: 0.9.0
Comment(by muellner):
The formula in the second link that warren.weckesser gave (where it is
called the Kulczynski-Cody index) agrees with the one in my two links. This
formula has the advantage that it agrees with R, so it would not confuse users.
But I agree with warren.weckesser that a definitive reference is
desirable. Unfortunately, the original source is in Polish and from 1928.
Let's trace the citations back:
Hennig and Hausdorf (doi:10.1080/10635150500481523) cite Shi
(doi:10.1016/0031-0182(93)90084-V). This author mentions two similarity
(sic) measures, both attributed to Kulczynski. The reference is
Kulczynski, S., 1928, Zespoły roślin w Pieninach. Bull. Int. Acad. Pol.
Sci. Lettres, Sér. B, Suppl., 2:57-203
I haven't read the Polish paper, but according to Shi's paper, the
measures are
"Kulczynski unnamed 1" = 1 - (dissimilarity measure from warren.weckesser's
first link)
"Kulczynski unnamed 2" = 1 - (dissimilarity measure from the other three
links)
I vote for the second version, since more people appear to use this one.
warren.weckesser already stated it:
    1 - 0.5 * ntt * (1.0/u.sum() + 1.0/v.sum())
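
For illustration, here is a minimal pure-Python sketch of the second version.
It is not the scipy implementation. The function name is hypothetical, and I
am assuming that ntt is the number of positions where both u and v are true.
Raising an error when either vector has no true entries is also my own
choice, since the formula divides by both sums.

```python
def kulczynski2_dissimilarity(u, v):
    """Sketch of 1 - 0.5 * ntt * (1/sum(u) + 1/sum(v)) for boolean vectors.

    Hypothetical helper for illustration only, not part of scipy.
    """
    u = [bool(x) for x in u]
    v = [bool(x) for x in v]
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")

    n_u = sum(u)  # number of true entries in u (u.sum() for an array)
    n_v = sum(v)  # number of true entries in v (v.sum() for an array)
    if n_u == 0 or n_v == 0:
        # Assumption: treat the measure as undefined here, because the
        # formula would divide by zero.
        raise ValueError("undefined when u or v has no true entries")

    # Assumption: ntt counts positions where both u and v are true.
    ntt = sum(a and b for a, b in zip(u, v))
    return 1.0 - 0.5 * ntt * (1.0 / n_u + 1.0 / n_v)


if __name__ == "__main__":
    print(kulczynski2_dissimilarity([1, 1, 0, 0], [1, 0, 1, 0]))
```

With this definition, identical vectors give 0 and vectors with no common
true position give 1.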
