[Scipy-tickets] [SciPy] #1328: scipy.spatial.distance.sqeuclidean broken: Invalid results

SciPy Trac scipy-tickets@scipy....
Sun Jun 12 11:07:16 CDT 2011

#1328: scipy.spatial.distance.sqeuclidean broken: Invalid results
 Reporter:  stefan_r                                                                |       Owner:  peridot     
     Type:  defect                                                                  |      Status:  needs_review
 Priority:  normal                                                                  |   Milestone:  0.10.0      
Component:  scipy.spatial                                                           |     Version:  0.8.0       
 Keywords:  squared euclidean distance,distance,scipy.spatial.distance.sqeuclidean  |  

Comment(by rgommers):

 The implementation is a bit messy, but the basic point is that 2-D inputs
 are simply not allowed for any of the distance metrics. This is very clear
 from both the docstrings and the implementation (many metrics use sum(),
 and so produce a single number). Results of euclidean/sqeuclidean are
 correct for 1-D input.

 pdist does allow 2-D inputs, but it calls a C implementation in
 _distance_wrap.so, not the Python one, unless you supply a callable for
 the "metric" keyword in pdist (and that path is probably broken). Ticket
 #810 proposes to fix this inconsistency, but that's quite some work.
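
 To make the distinction concrete: a metric function takes two 1-D vectors,
 whereas pdist takes a 2-D array with one observation per row and returns
 the distance for every pair of rows in condensed order. The following is a
 minimal NumPy-only sketch of that row pairing. The names sqeuclidean_1d and
 pairwise_condensed are hypothetical and chosen for illustration; this is
 not SciPy's implementation, which lives in C.

```python
import numpy as np


def sqeuclidean_1d(u, v):
    """Squared Euclidean distance between two 1-D vectors (illustrative)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.sum((u - v) ** 2))


def pairwise_condensed(X, metric):
    """Apply `metric` to every pair of rows (i < j) of a 2-D array X.

    Hypothetical helper: it follows the condensed ordering that pdist
    returns, but it is a plain Python loop, not SciPy's C code path.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be 2-D: one observation per row.")
    m = X.shape[0]
    return np.array([metric(X[i], X[j])
                     for i in range(m) for j in range(i + 1, m)])


# Three observations in two dimensions give three pairwise distances,
# ordered as (row 0, row 1), (row 0, row 2), (row 1, row 2).
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
d = pairwise_condensed(X, sqeuclidean_1d)
```

 Here the metric only ever sees 1-D rows, which is consistent with the
 metrics themselves rejecting 2-D input.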

 I have already written tests for what I think should happen with 2-D
 inputs for now:
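
 import numpy as np
 from numpy.testing import (assert_almost_equal, assert_array_almost_equal,
                            assert_raises)
 from scipy.spatial.distance import euclidean, sqeuclidean

 def test_euclideans():
     """Regression test for ticket #1328."""
     # Use arrays, not lists, so the np.newaxis indexing below works
     x1 = np.array([1, 1, 1])
     x2 = np.array([0, 0, 0])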
     assert_almost_equal(sqeuclidean(x1, x2), 3.0, decimal=14)
     assert_almost_equal(euclidean(x1, x2), np.sqrt(3), decimal=14)
     # Check flattening for (1, N) or (N, 1) inputs
     assert_almost_equal(euclidean(x1[np.newaxis, :], x2[np.newaxis, :]),
                         np.sqrt(3), decimal=14)
     assert_almost_equal(sqeuclidean(x1[np.newaxis, :], x2[np.newaxis, :]),
                         3.0, decimal=14)
     assert_almost_equal(sqeuclidean(x1[:, np.newaxis], x2[:, np.newaxis]),
                         3.0, decimal=14)

     # Distance metrics only defined for vectors (= 1-D)
     x = np.arange(4).reshape(2, 2)
     assert_raises(ValueError, euclidean, x, x)
     assert_raises(ValueError, sqeuclidean, x, x)

     rs = np.random.RandomState(1234567890)
     x = rs.rand(10)
     y = rs.rand(10)
     d1 = euclidean(x, y)
     d2 = sqeuclidean(x, y)
     assert_array_almost_equal(d1**2, d2, decimal=14)
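
 The behaviour these tests expect could be obtained with input validation
 along the following lines. This is only a sketch under stated assumptions:
 the helper _as_vector and the *_sketch function names are hypothetical, and
 this is not the actual patch.

```python
import numpy as np


def _as_vector(u):
    """Return u as a 1-D float array (hypothetical helper).

    (1, N) and (N, 1) inputs are flattened; an input with more than one
    non-trivial dimension raises ValueError.
    """
    u = np.atleast_1d(np.asarray(u, dtype=float).squeeze())
    if u.ndim > 1:
        raise ValueError("Input vector should be 1-D.")
    return u


def sqeuclidean_sketch(u, v):
    """Squared Euclidean distance between two vectors."""
    u = _as_vector(u)
    v = _as_vector(v)
    if u.shape != v.shape:
        raise ValueError("Input vectors must have the same length.")
    diff = u - v
    return float(np.dot(diff, diff))


def euclidean_sketch(u, v):
    """Euclidean distance, defined as the square root of the squared one."""
    return float(np.sqrt(sqeuclidean_sketch(u, v)))
```

 Defining euclidean through sqeuclidean keeps the two consistent by
 construction, which is the property the last assertion in the test checks.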

Ticket URL: <http://projects.scipy.org/scipy/ticket/1328#comment:10>