[Numpy-discussion] Jaccard & Hamming Problem

josef.pktd@gmai... josef.pktd@gmai...
Thu Mar 1 19:13:03 CST 2012


On Thu, Mar 1, 2012 at 10:10 AM, Zayd YAKOUBI <zayd.yakoubi@gmail.com> wrote:
> Thank you very much.
> In fact, these two measures are defined for binary vectors, and I
> have not found an extension of them to real-valued data such as 0.7, 0.9, 1.7, ...
> I applied them to such data anyway, and they worked well.
>
> Do you have any idea which version of these functions applies to this type of data?

For Hamming, just guessing: 1 - np.mean(x == y), although this might
depend on the implementation.

>>> import numpy as np
>>> from scipy import spatial
>>> spatial.distance.hamming([0,0.5,1,2], np.ones(4))
0.75
>>> 1 - np.mean([0,0.5,1,2] == np.ones(4))
0.75
>>> spatial.distance.hamming([0,0.5,1,1], np.ones(4))
0.5
>>> 1 - np.mean([0,0.5,1,1] == np.ones(4))
0.5

However, I wouldn't trust it for floating-point numbers unless you are
sure about their floating-point representation, because two values that
should be equal can differ in the last bit:

>>> [0,0.5,3,2], [0,0.5,np.sqrt(3)**2,2]
([0, 0.5, 3, 2], [0, 0.5, 2.9999999999999996, 2])
>>> spatial.distance.hamming([0,0.5,3,2], [0,0.5,np.sqrt(3)**2,2])
0.25
>>> spatial.distance.hamming([0,0.5,3,2], [0,0.5,3,2])
0.0

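For Jaccard on real-valued data, one generalization that is sometimes used
is the weighted Jaccard (also called Ruzicka) distance,
1 - sum(min(x, y)) / sum(max(x, y)), which reduces to the usual Jaccard
distance on 0/1 vectors. This is only a sketch for non-negative data; it is
not what scipy.spatial.distance.jaccard computes, and the function name
below is hypothetical:

```python
import numpy as np

def weighted_jaccard_distance(u, v):
    """Weighted Jaccard (Ruzicka) distance for non-negative vectors.

    Hypothetical helper, not a SciPy function.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    if u.shape != v.shape:
        raise ValueError("u and v must have the same shape")
    if np.any(u < 0) or np.any(v < 0):
        raise ValueError("only defined here for non-negative data")
    denom = np.maximum(u, v).sum()
    if denom == 0:
        # both vectors are all zeros: treat them as identical
        return 0.0
    return 1.0 - np.minimum(u, v).sum() / denom

# real-valued example: sum of minima is 2.3, sum of maxima is 2.5
d = weighted_jaccard_distance([0.6, 0.2, 1.7], [0.6, 0.0, 1.7])

# on binary vectors this is 1 - |intersection| / |union| = 1 - 1/3
d_binary = weighted_jaccard_distance([1, 1, 0, 0], [1, 0, 1, 0])
print(d, d_binary)
```
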
Josef

>
> Thank you for your help.
>
> Regards,
> Zayd
>
>
>
>
> 2012/3/1 Warren Weckesser <warren.weckesser@enthought.com>
>>
>>
>>
>> On Thu, Mar 1, 2012 at 8:43 AM, Zayd YAKOUBI <zayd.yakoubi@gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I use the "Jaccard" and "Hamming" similarity measures from
>>> scipy.spatial.distance.cdist (Python) in a clustering context. I applied them to
>>> data of real and integer types (0.6 0.2 1.7 May 8 ), and they gave good results. But I
>>> have just learned that they normally apply only to binary data. The definitions of
>>> these two similarity measures are not specified in the documentation:
>>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html.
>>> Can anyone help me find these definitions?
>>> Thank you in advance.
>>>
>>
>>
>>
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.hamming.html#scipy.spatial.distance.hamming
>>
>>
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.jaccard.html#scipy.spatial.distance.jaccard
>>
>>
>> Those are the nicely formatted versions of the functions' docstrings.
>> You can also access these in an interactive shell, e.g.
>>
>> >>> from scipy.spatial.distance import hamming
>> >>> help(hamming)
>>
>> or in IPython
>>
>> In [1]: from scipy.spatial.distance import hamming
>> In [2]: hamming?
>>
>>
>> Warren
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>

