[Scipy-tickets] [SciPy] #1484: "Kulsinski" dissimilarity seems wrong

SciPy Trac scipy-tickets@scipy....
Sun Aug 7 13:35:19 CDT 2011


#1484: "Kulsinski" dissimilarity seems wrong
---------------------------+------------------------------------------------
 Reporter:  muellner       |       Owner:  peridot    
     Type:  defect         |      Status:  new        
 Priority:  normal         |   Milestone:  Unscheduled
Component:  scipy.spatial  |     Version:  0.9.0      
 Keywords:                 |  
---------------------------+------------------------------------------------

Comment(by warren.weckesser):

 The problem with kulsinski returning an integer was fixed in
 commit:32f9e3d8e0154da1b941.

 However, the formula itself also appears to be incorrect.  The calculation
 should probably be something like this (using the variables from the
 source code):

    1 - 0.5 * ntt * (1.0/u.sum() + 1.0/v.sum())

 or equivalently

    1 - 0.5 * ntt * (1.0/(ntt + ntf) + 1.0/(nft + ntt))

 so, for example, the Kulsinski dissimilarity of [1, 1, 0, 0] and [0, 1, 1,
 0] (where ntt = ntf = nft = 1) should be 0.5.  The current implementation
 gives:

 In [10]: kulsinski([1,1,0,0], [0,1,1,0])
 Out[10]: 0.83333333333333337

 On the other hand, there are these pages on Kulczynski measures:
     http://www.mothur.org/wiki/Kulczynski
     http://www.mothur.org/wiki/Kulczynskicody

 A definitive reference would be nice to have.

-- 
Ticket URL: <http://projects.scipy.org/scipy/ticket/1484#comment:3>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

