# [SciPy-user] sparse matrices - list assignment to rows and columns

josef.pktd@gmai... josef.pktd@gmai...
Thu Apr 9 08:36:30 CDT 2009

```
On Thu, Apr 9, 2009 at 5:17 AM, Gergely Imreh <imrehg@gmail.com> wrote:
> 2009/4/9  <josef.pktd@gmail.com>:
>> On Thu, Apr 9, 2009 at 4:51 AM,  <josef.pktd@gmail.com> wrote:
>>> On Wed, Apr 8, 2009 at 11:51 PM, Gergely Imreh <imrehg@gmail.com> wrote:
>>>> Hi,
>>>>
>>>>  I was trying to figure out the scipy sparse matrix handling, but ran
>>>> into some difficulties assigning a list of values to rows and columns.
>>>>  The scipy tutorial has the following example [1]:
>>>>
>>>> from numpy import ones, random
>>>> from scipy import sparse
>>>> Asp = sparse.lil_matrix((50000,50000))
>>>> Asp.setdiag(ones(50000))
>>>> Asp[20,100:250] = 10*random.rand(150)
>>>> Asp[200:250,30] = 10*random.rand(50)
>>>>
>>>>  That looks straightforward enough, make a large, diagonal sparse
>>>> matrix, and set some additional elements to non-zero. What I get,
>>>> however, is different:
>>>> Asp[20,100:250] = 10*random.rand(150) sets the matrix elements at row
>>>> 20, columns 100-249 to random values.
>>>> Asp[200:250,30] = 10*random.rand(50) sets the matrix element at row
>>>> 200, column 30 to a 50-element row vector with random values
>>>> (the elements at rows 201-249, column 30 are still 0).
>>>>  If I reshape the result of random.rand(50) into a column
>>>> instead of a row, the assignment results in the elements of rows
>>>> 200-249, column 30 being set to single-element arrays (so, for
>>>> example, Asp[200,30] will be an array that has a single random
>>>> value at [0,0]).
>>>>
>>>>  I'm using Python 2.6 (that comes with my distro, or 2.4 for which
>>>> I'd have to recompile scipy) and scipy 0.7.0. Is this kind of
>>>> behaviour due to the changes (and incompatibilities) of 2.6 (since I
>>>> know scipy is written to be compatible up to 2.5) or something else?
>>>> Would the other sparse matrix types handle this differently?
>>>>  A workaround is to do single-element assignments, but I'd think
>>>> that's probably slower in general.
>>>>
>>>>  Cheers!
>>>>      Greg
>>>> [1] http://www.scipy.org/SciPy_Tutorial
>>>
>>>
>>> There is an assignment error:
>>> Asp[200:250,30] seems to assign all 50 elements to the position Asp[200,30]
>>>
>>>>>> Asp[200,30]
>>> <1x50 sparse matrix of type '<type 'numpy.float64'>'
>>>        with 50 stored elements in LInked List format>
>>>>>> Aspc = Asp.tocrc()
>>> Traceback (most recent call last):
>>>  File "<pyshell#4>", line 1, in <module>
>>>    Aspc = Asp.tocrc()
>>>  File "c:\josef\_progs_scipy\scipy\sparse\base.py", line 429, in __getattr__
>>>    raise AttributeError, attr + " not found"
>>
>> sorry, I copied the wrong traceback, it should be:
>>
>>>>> Aspc = Asp.tocsr()
>> Traceback (most recent call last):
>>  File "<pyshell#7>", line 1, in <module>
>>    Aspc = Asp.tocsr()
>>  File "c:\josef\_progs_scipy\scipy\sparse\lil.py", line 427, in tocsr
>>    data = np.asarray(data, dtype=self.dtype)
>>  File "C:\Programs\Python25\Lib\site-packages\numpy\core\numeric.py",
>> line 230, in asarray
>>    return array(a, dtype, copy=False, order=order)
>> ValueError: setting an array element with a sequence.
>>
>>>
>>> this is with
>>>>>> scipy.version.version
>>> '0.8.0.dev5551'
>>>
>>> There is a related assignment error that got fixed in trunk.
>>> I don't know if the fix also handles this case; a bug report might be
>>> useful to make sure this case is handled correctly.
>>>
>>> I think for this example the dok format would be better for building
>>> the matrix, since column slices in lil need to access many lists.
>>>
>>> Asp = sparse.dok_matrix((50000,50000))
>>> Aspr = Asp.tocsr()
>>>
>>> works without problems
>>>
>>> I checked the history of the scipy tutorial that you linked to, the
>>> main editing has been done in 2006, and maybe it isn't up to date.
>>>
>>> The current docs are being written and are available at
>>> http://docs.scipy.org/doc/
>>>
>>> Josef
>>>
>
>
> Yes, I think it is the same; I got a ValueError as well after upgrading
> to the latest (r5655) version.
>
> Traceback (most recent call last):
>  File "sp2.py", line 6, in <module>
>    Asp[200:250,30] = 10*random.rand(50)
>  File "/usr/lib/python2.6/site-packages/scipy/sparse/lil.py", line
> 329, in __setitem__
>    self._insertat3(row, data, j, xx)
>  File "/usr/lib/python2.6/site-packages/scipy/sparse/lil.py", line
> 285, in _insertat3
>    self._insertat2(row, data, j, x)
>  File "/usr/lib/python2.6/site-packages/scipy/sparse/lil.py", line
> 246, in _insertat2
>    raise ValueError('setting an array element with a sequence')
> ValueError: setting an array element with a sequence
>
> I checked the new documentation you referenced[1], and it shows only
> same-row assignment (e.g. A[0, :100] = rand(100)) but no same-column
> assignment.
>
> So my question remains: is there something inherently different
> between array -> row and array -> column assignment in this case?

I'm not completely sure about the internal structure, but essentially
the non-zero values are stored in row-wise lists. To assign to a column
slice, it needs to access every row list and insert into each one, and
this is much slower.

See the explanation in the documentation describing the different formats.

Josef
```
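
For readers who want to reproduce the assignments discussed in this thread, here is a minimal sketch. It assumes a recent SciPy release, where column-slice assignment to a `lil_matrix` is expected to work (the thread reports it failing in 0.7.0 and in 0.8.0 development revisions). The smaller matrix size `n`, the seeded generator `rng`, and the names `row_vals` and `col_vals` are illustrative choices, not part of the original posts.

```python
import numpy as np
from scipy import sparse

# Assumption: a smaller matrix than the 50000 x 50000 one in the thread,
# purely to keep the sketch quick to run.
n = 1000
rng = np.random.default_rng(0)

A = sparse.lil_matrix((n, n))
A.setdiag(np.ones(n))

row_vals = 10 * rng.random(150)
col_vals = 10 * rng.random(50)

# Same-row assignment: touches a single row list in the LIL structure.
A[20, 100:250] = row_vals

# Same-column assignment: touches 50 different row lists, one entry each.
# The values are passed as an explicit (50, 1) column to match the target.
A[200:250, 30] = col_vals.reshape(-1, 1)

# Each target element should now hold one scalar, not a nested array.
print(type(A[200, 30]), type(A[249, 30]))

# Conversion to CSR is the step that raised ValueError in the thread.
A_csr = A.tocsr()
print(A_csr.shape)
```

If the column assignment behaves as intended, rows 200-249 of column 30 each hold one value from `col_vals`, and `tocsr()` completes without the "setting an array element with a sequence" error.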