[Scipy-tickets] [SciPy] #1328: scipy.spatial.distance.sqeuclidean broken: Invalid results
SciPy Trac
scipy-tickets@scipy....
Sun Jun 12 11:07:16 CDT 2011
#1328: scipy.spatial.distance.sqeuclidean broken: Invalid results
------------------------------------------------------------------------------------+
Reporter: stefan_r | Owner: peridot
Type: defect | Status: needs_review
Priority: normal | Milestone: 0.10.0
Component: scipy.spatial | Version: 0.8.0
Keywords: squared euclidean distance,distance,scipy.spatial.distance.sqeuclidean |
------------------------------------------------------------------------------------+
Comment(by rgommers):
The implementation is a bit messy, but the basic point is that 2-D inputs
are simply not allowed for any of the distance metrics. This is clear
from both the docstrings and the implementation (many metrics use sum(),
so they produce a single number). The results of euclidean/sqeuclidean are
correct for 1-D input.
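To illustrate the sum() point, here is a minimal sketch. naive_sqeuclidean is a hypothetical stand-in written for this comment, not the actual SciPy code; it only assumes that the metric sums over every element of the squared difference.

```python
import numpy as np


def naive_sqeuclidean(u, v):
    # Hypothetical stand-in (not SciPy's implementation): sum the squared
    # differences over *all* elements, whatever the input shape is.
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return ((u - v) ** 2).sum()


# 1-D input: the result is the squared distance between two vectors.
d_1d = naive_sqeuclidean([1, 1, 1], [0, 0, 0])

# 2-D input: two rows go in, but a single number comes out, because
# sum() without an axis argument collapses both dimensions.
a = np.array([[1, 1, 1], [2, 2, 2]])
b = np.zeros((2, 3))
d_2d = naive_sqeuclidean(a, b)

# A row-wise result needs an explicit axis.
row_wise = ((a - b) ** 2).sum(axis=1)
```

With these inputs d_2d is the total over both rows, while row_wise keeps one value per row, which is presumably what someone passing a 2-D array expects.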
pdist does allow 2-D inputs, but it calls a C implementation in
_distance_wrap.so, not the Python one, unless you supply a callable for
the "metric" keyword in pdist, and that path is probably broken. Ticket #810
proposes to fix this inconsistency, but that's quite some work.
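For reference, a small sketch of the two pdist code paths, assuming a recent SciPy where both are expected to work. The sample array X and the helper sq_dist are made up for illustration; the sketch says nothing about how the callable path behaves in 0.8.0.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Three 2-D points, one per row (illustrative data only).
X = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [3.0, 4.0]])

# Metric given as a string: pdist dispatches to its compiled implementation.
d_builtin = pdist(X, metric="sqeuclidean")


def sq_dist(u, v):
    # Hypothetical Python-level metric; pdist calls it once per pair of rows,
    # so u and v are always 1-D here.
    return float(np.sum((u - v) ** 2))


# Metric given as a callable: pdist loops over row pairs in Python.
d_callable = pdist(X, metric=sq_dist)
```

Both calls return the condensed distance vector for the pairs (0, 1), (0, 2), (1, 2), so comparing d_builtin with d_callable is a quick consistency check between the two paths.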
I already wrote the tests for what I think should happen for 2-D inputs
for now:
{{{
import numpy as np
from numpy.testing import (assert_almost_equal, assert_array_almost_equal,
                           assert_raises)
from scipy.spatial.distance import euclidean, sqeuclidean


def test_euclideans():
    """Regression test for ticket #1328."""
    # Arrays, not lists, so that indexing with np.newaxis works below.
    x1 = np.array([1, 1, 1])
    x2 = np.array([0, 0, 0])
    assert_almost_equal(sqeuclidean(x1, x2), 3.0, decimal=14)
    assert_almost_equal(euclidean(x1, x2), np.sqrt(3), decimal=14)

    # Check flattening for (1, N) or (N, 1) inputs
    assert_almost_equal(euclidean(x1[np.newaxis, :], x2[np.newaxis, :]),
                        np.sqrt(3), decimal=14)
    assert_almost_equal(sqeuclidean(x1[np.newaxis, :], x2[np.newaxis, :]),
                        3.0, decimal=14)
    assert_almost_equal(sqeuclidean(x1[:, np.newaxis], x2[:, np.newaxis]),
                        3.0, decimal=14)

    # Distance metrics only defined for vectors (= 1-D)
    x = np.arange(4).reshape(2, 2)
    assert_raises(ValueError, euclidean, x, x)
    assert_raises(ValueError, sqeuclidean, x, x)

    # The squared Euclidean distance must match euclidean()**2
    rs = np.random.RandomState(1234567890)
    x = rs.random_sample(10)
    y = rs.random_sample(10)
    d1 = euclidean(x, y)
    d2 = sqeuclidean(x, y)
    assert_array_almost_equal(d1**2, d2, decimal=14)
}}}
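One way the behaviour checked by these tests could be obtained is sketched below. This is only an assumption about a possible fix, not the patch itself; as_vector and sqeuclidean_checked are hypothetical names.

```python
import numpy as np


def as_vector(x):
    """Hypothetical helper: return x as a 1-D array or raise ValueError."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        return x
    # (1, N) and (N, 1) inputs hold a single vector, so flatten them.
    if x.ndim == 2 and 1 in x.shape:
        return x.ravel()
    # Anything else (for example a 2x2 array) is not a vector.
    raise ValueError("distance metrics are only defined for 1-D vectors")


def sqeuclidean_checked(u, v):
    # Hypothetical wrapper: validate both inputs, then compute the squared
    # Euclidean distance as the dot product of the difference with itself.
    u = as_vector(u)
    v = as_vector(v)
    diff = u - v
    return float(np.dot(diff, diff))
```

Doing the check in one shared helper would keep the individual metrics short, at the cost of an extra asarray call per input.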
--
Ticket URL: <http://projects.scipy.org/scipy/ticket/1328#comment:10>