[Scipy-tickets] [SciPy] #1484: "Kulsinski" dissimilarity seems wrong
SciPy Trac
scipy-tickets@scipy....
Sun Aug 7 13:35:19 CDT 2011
#1484: "Kulsinski" dissimilarity seems wrong
---------------------------+------------------------------------------------
 Reporter:  muellner       |       Owner:  peridot
     Type:  defect         |      Status:  new
 Priority:  normal         |   Milestone:  Unscheduled
Component:  scipy.spatial  |     Version:  0.9.0
 Keywords:                 |
---------------------------+------------------------------------------------
Comment(by warren.weckesser):
The problem with kulsinski returning an integer was fixed in
commit:32f9e3d8e0154da1b941.
However, the formula itself does appear to be incorrect. Using the
variables from the source code, the calculation should be something
like:
1 - 0.5 * ntt * (1.0/u.sum() + 1.0/v.sum())
or equivalently
1 - 0.5 * ntt * (1.0/(ntt + ntf) + 1.0/(nft + ntt))
so, for example, the kulsinski dissimilarity of [1, 1, 0, 0] and
[0, 1, 1, 0] should be 0.5. The current implementation gives:
In [10]: kulsinski([1,1,0,0], [0,1,1,0])
Out[10]: 0.83333333333333337
On the other hand, there are these pages on the related Kulczynski
measures:
http://www.mothur.org/wiki/Kulczynski
http://www.mothur.org/wiki/Kulczynskicody
A definitive reference would be nice to have.
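
For anyone who wants to check the arithmetic, the proposed formula above
can be sketched in plain Python. This is only an illustration of the
expression in this comment, not the scipy implementation: the function
name proposed_kulsinski is hypothetical, and raising an error when either
vector has no true entries is an assumption made here, since the formula
is undefined in that case.

```python
def proposed_kulsinski(u, v):
    """Hypothetical helper: the dissimilarity proposed in this comment.

    ntt, ntf and nft follow the naming used in the source code:
    ntt = both true, ntf = true in u only, nft = true in v only.
    """
    u = [bool(x) for x in u]
    v = [bool(x) for x in v]
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")

    ntt = sum(1 for a, b in zip(u, v) if a and b)
    ntf = sum(1 for a, b in zip(u, v) if a and not b)
    nft = sum(1 for a, b in zip(u, v) if not a and b)

    # ntt + ntf is u.sum() and nft + ntt is v.sum(); the formula divides
    # by both, so it is undefined if either vector is all false.
    # Raising here is an assumption, not behavior taken from scipy.
    if ntt + ntf == 0 or nft + ntt == 0:
        raise ValueError("formula is undefined for an all-false vector")

    return 1 - 0.5 * ntt * (1.0 / (ntt + ntf) + 1.0 / (nft + ntt))


print(proposed_kulsinski([1, 1, 0, 0], [0, 1, 1, 0]))  # 0.5
```

With this form, identical vectors give 0 and vectors with no true
entries in common give 1.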
--
Ticket URL: <http://projects.scipy.org/scipy/ticket/1484#comment:3>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.