[SciPy-Dev] Indexing sparse matrices
Bruce Southey
bsouthey@gmail....
Thu Jun 24 09:39:30 CDT 2010
On 06/23/2010 11:43 PM, Jason Grout wrote:
> On 6/23/10 12:31 PM, Bruce Southey wrote:
>
>> On 06/23/2010 09:22 AM, Jason Grout wrote:
>>
>>
>>> Hi everyone,
>>>
>>> This bug in scipy sparse matrices was recently reported in Sage:
>>>
>>> ----------------------------------------------------------------------
>>> | Sage Version 4.4.2, Release Date: 2010-05-19 |
>>> | Type notebook() for the GUI, and license() for information. |
>>> ----------------------------------------------------------------------
>>> sage: from scipy import sparse
>>> sage: a = sparse.lil_matrix((10,10))
>>> sage: a[1,2] = 1
>>> ---------------------------------------------------------------------------
>>> ValueError Traceback (most recent call last)
>>>
>>> /Users/grout/sage-4.4.2-test3/local/lib/python2.6/site-packages/numpy/core/<ipython
>>> console> in<module>()
>>>
>>> /Users/grout/sage/local/lib/python2.6/site-packages/scipy/sparse/lil.pyc
>>> in __setitem__(self, index, x)
>>> 320 self._insertat3(row, data, j, xx)
>>> 321 else:
>>> --> 322 raise ValueError('invalid index value: %s' % str((i,
>>> j)))
>>> 323
>>> 324 def _mul_scalar(self, other):
>>>
>>> ValueError: invalid index value: (1, 2)
>>>
>>>
>>> Since the preparser in Sage makes integers instances of the Sage Integer
>>> class, and the Sage Integer class is not in the specific list of types
>>> checked by numpy.isscalar, the case falls through the checks in the
>>> sparse matrix setitem method to the last error-raising case. The problem
>>> seems to be that scipy checks for a specific list of types (via
>>> numpy.isscalar), instead of just using the __index__ magic method which
>>> was designed for this purpose (by Travis Oliphant for numpy! see
>>> http://docs.python.org/whatsnew/2.5.html#pep-357-the-index-method).
>>>
>>> Note that things work fine in numpy:
>>>
>>> sage: import numpy
>>> sage: a=numpy.array([[1,2],[3,4]],dtype=int); a
>>> array([[1, 2],
>>> [3, 4]])
>>> sage: a[1,1]=5
>>> sage: a
>>> array([[1, 2],
>>> [3, 5]])
>>>
>>>
>>> Thanks,
>>>
>>> Jason
>>>
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>>
>>>
>>>
>> Please post this discussion on the numpy list from the scipy-dev list
>> because this is not really scipy issue (yet).
>>
>>
This is not the place for this discussion!
>>
>>
> The thing is that the index object may not be a scalar, but it may have
> an __index__ method that is supposed to be used when the object is used
> as an index. So the call to isscalar may rightfully return False, but
> the __index__ method should still be called to see if there is a value
> to act as an index. For example, according to Python>=2.5, this should
> work:
>
Remember (as given below) that numpy officially supports Python 2.4 so
any Python 2.5+ features are not allowed (at least until some one
notices them).
If you just look at the numpy's isscalar function you can see why it
fails - so file a bug report on it.
> Python 2.6.4 (r264:75706, May 25 2010, 15:42:09)
> Type "copyright", "credits" or "license" for more information.
>
> IPython 0.9.1 -- An enhanced Interactive Python.
> ? -> Introduction and overview of IPython's features.
> %quickref -> Quick reference.
> help -> Python's own help system.
> object? -> Details about 'object'. ?object also works, ?? prints more.
>
> In [1]: class MyObject(object):
> ...: def __index__(self):
> ...: return 3
> ...:
> ...:
>
> In [2]: import numpy
>
> In [3]: a=numpy.array([[1,2,3,4],[3,2,1,2]])
>
> In [4]: ind=MyObject()
>
> In [5]: a[1,ind]
> Out[5]: 2
>
> ***************************
> Note: it works fine in numpy
> ***************************
>
>
> In [6]: from scipy import sparse
>
> In [7]: a = sparse.lil_matrix((10,10))
>
> In [8]: a[1,ind]=3
> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call last)
>
> /Users/grout/projects/reptheory/practice/<ipython console> in<module>()
>
> /Users/grout/sage/local/lib/python2.6/site-packages/scipy/sparse/lil.pyc
> in __setitem__(self, index, x)
> 302 row = self.rows[i]
> 303 data = self.data[i]
> --> 304 self._insertat3(row, data, j, x)
> 305 elif issequence(i) and issequence(j):
> 306 if np.isscalar(x):
>
> /Users/grout/sage/local/lib/python2.6/site-packages/scipy/sparse/lil.pyc
> in _insertat3(self, row, data, j, x)
> 276 self._insertat2(row, data, j, x)
> 277 else:
> --> 278 raise ValueError('invalid column value: %s' % str(j))
> 279
> 280
>
> ValueError: invalid column value:<__main__.MyObject object at 0x1017a9f90>
>
>
> **************************
> Note that the problem above is that only the first index is checked to
> be a scalar, instead of all indices. Continuing on...
>
It is meaningless to compare an ndarray with a sparse matrix because
these are not the same thing (IIRC sparse matrices are not a subclass of
ndarrays).
> **************************
>
>
> In [9]: a.__setitem__??
>
> In [10]: numpy.isscalar(ind)
> Out[10]: False
>
Correct since ind should be a class instance.
>>> ind
<__main__.MyObject object at 0x11e4950>
> In [11]: a.__setitem__??
>
> In [12]: numpy.isscalar(ind)
> Out[12]: False
>
> In [13]: a[ind,1]=3
> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call last)
>
> /Users/grout/projects/reptheory/practice/<ipython console> in<module>()
>
> /Users/grout/sage/local/lib/python2.6/site-packages/scipy/sparse/lil.pyc
> in __setitem__(self, index, x)
> 320 self._insertat3(row, data, j, xx)
> 321 else:
> --> 322 raise ValueError('invalid index value: %s' % str((i,
> j)))
> 323
> 324 def _mul_scalar(self, other):
>
> ValueError: invalid index value: (<__main__.MyObject object at
> 0x1017a9f90>, 1)
>
> **********
> Here, the __index__ method of ind was not called. Note that ind itself
> may actually be something we'd normally consider to be not a scalar, so
> again, I don't think the isscalar check is really what needs to happen.
> Rather, if scipy wants an index, it should check for an index specifically.
>> I agree that this appears to be an issue between Sage and numpy's
>> isscalar function. There are many uses of isscalar() in numpy (by
>> grepping the source) and scipy that should be failing with Sage. For
>> example you should be getting errors using numpy's polynomial functions
>> (polynomial/polynomial.py). I would have thought that you would be
>> seeing lots of errors when running numpy and scipy tests within Sage.
>> (Perhaps those tests don't occur with Sage integers because Sage does
>> not create them.)
>>
>>
> We used to have lots of errors before using the standard numpy ways for
> converting Sage datatypes to numpy datatypes. We should look at these
> other situations when isscalar might be called, though. I do believe
> the above issue is a different issue, since scipy specifically wants
> indices, which don't always correspond with something being a "scalar".
>
>> Also, note that numpy still supports Python 2.4.
>>
>>
> Good point. I don't know if it's better to test the version number of
> python before trying to use __index__, or to just assume that python 2.4
> code may still use __index__ like we expect in 2.5.
>
> Thanks,
>
> Jason
>
>
I am not an expert on numpy but I do think you confusing how you think
it should work and how it actually works.
The isscalar function is a very simple function that checks if the input
type is in np.generic or in np.ScalarType (I think these are all in
numpy/core/numerictypes.py). Thus, isscalar is really only checking if
the input is a valid dtype as those must be scalar - hence the wikipedia
definition of scalar:
en.wikipedia.org/wiki/Scalar_(computing)
Thus not even a list is a numpy 'scalar'
>>> np.isscalar([3.1])
False
Changing that numerictypes.py file would be an highly undesirable ABI
change. The solution, which should help Sage, is to adapt isscalar to
allow similar things to Sage's scalars. But I am not sure how check that
a class/function returns a valid dtype scalar - I am not even sure if
the test 'isinstance(num, generic)' is even needed.
So you need to report a bug for isscalar that includes a non-Sage
example. It would be great if you could also provide a patch as well.
Bruce
More information about the SciPy-Dev
mailing list