[SciPy-dev] Possible Error in Kendall's Tau (scipy.stats.stats.kendalltau)

josef.pktd@gmai... josef.pktd@gmai...
Wed Mar 18 22:07:15 CDT 2009


> Kendall tau-c (alternative tie handling):
> -----------------------------------------
> (also called Stuart's tau-c or Kendall-Stuart's tau-c)
>        t = (m * (P - Q)) / (n^2 * (m - 1))
> where P is the number of concordant pairs, Q the number of discordant
> pairs, n the number of items and m = min(r,s) where r and s are the
> number of rows and columns in the data.
>
> [Note that some incorrect definitions of Kendall tau-c are floating
>  around which use 2m instead of m in the numerator. Since this
>  can yield values outside of the (-1, +1) range, it is obviously wrong.]
>

Just a final comment:
I think there are also two different definitions of pairs in use,
depending on whether each pair is counted twice, e.g. a is compared
with b and b is compared with a.

If both directions are counted as separate pairs, then there are
n*(n-1) pairs (this is what I use), otherwise there are n*(n-1)/2
pairs. For tau-a and tau-b it doesn't matter as long as the same
definition is used in the numerator and in the denominator.
For tau-c, I get the same result as in the SPSS example. The SPSS
explanation uses 2*m where I use m, but it counts only half as many
pairs as I do, which exactly compensates.
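
A minimal sketch of that compensation, in case it helps. The helper
names (count_pairs, tau_c) and the sample data are made up for
illustration and are not part of scipy.stats; m is taken as the smaller
of the number of distinct x values and distinct y values, which I assume
matches the rows/columns of the contingency table.

```python
from itertools import combinations, permutations


def count_pairs(x, y, ordered):
    """Count concordant (P) and discordant (Q) pairs; ties are skipped.

    ordered=True  counts (a, b) and (b, a) separately: n*(n-1) pairs.
    ordered=False counts each pair once:               n*(n-1)/2 pairs.
    """
    idx = range(len(x))
    pairs = permutations(idx, 2) if ordered else combinations(idx, 2)
    P = Q = 0
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            P += 1
        elif s < 0:
            Q += 1
    return P, Q


def tau_c(x, y, ordered):
    """Kendall tau-c under either pair-counting convention (illustrative)."""
    n = len(x)
    # Assumption: m = min(rows, columns) of the cross-tabulation of x and y.
    m = min(len(set(x)), len(set(y)))
    P, Q = count_pairs(x, y, ordered)
    # Use m when every pair is counted twice, 2*m when counted once.
    factor = m if ordered else 2 * m
    return factor * (P - Q) / (n ** 2 * (m - 1))


# Made-up sample data with ties in both variables.
x = [1, 1, 2, 2, 3, 3]
y = [1, 2, 1, 2, 2, 3]

P2, Q2 = count_pairs(x, y, ordered=True)
P1, Q1 = count_pairs(x, y, ordered=False)
print(P2, Q2, P1, Q1)
print(tau_c(x, y, ordered=True), tau_c(x, y, ordered=False))
```

The ordered counts come out as exactly twice the unordered ones, so
m with doubled pairs and 2*m with single pairs give the same tau-c.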

Josef

