[SciPy-User] Sparse vector
Thu Apr 15 17:18:02 CDT 2010
On 15 April 2010 17:29, Felix Schlesinger <firstname.lastname@example.org> wrote:
> I was wondering what people's recommendations and thoughts are on
> sparse vectors, i.e. long vectors where most entries are 0.
> 1-D numpy arrays waste a lot of memory in that case. Python
> defaultdicts still use more memory than should be needed (since they
> store Python objects) and do not work well for numpy math operations
> and slicing. Scipy.sparse has several implementations for sparse 2D
> matrices which could be used for vectors, but that does not seem ideal
> for clarity, efficiency and function broadcasting. Is there something
> else out there or am I maybe missing a simple way to do it?
> In my particular case the vectors would be write-once, read-often and
> maybe about 1% filled with integers. They are small enough to fit into
> memory in dense form one at a time during construction.
The short answer is, no, there's no support for such a thing in numpy/scipy.
There's no way to make such a thing under-the-hood compatible with
numpy arrays, since they require evenly-strided memory. And scipy's
sparse matrices are built on the assumption that an n by n matrix will
have at least O(n) nonzero elements, so you are going to have to watch
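carefully what you do with your sparse vectors. That said, dok
matrices should be all right (though no more efficient than
defaultdicts) and one of csr/csc matrices will be efficient, depending
on whether you view your vectors as row or column matrices: csr for a
1 by n row vector, csc for an n by 1 column vector. The other choice
stores an index pointer of length n+1, which defeats the purpose.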
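
As a rough illustration of the csr suggestion above, here is a minimal
sketch that builds a roughly 1% filled integer vector in dense form and
then stores it as a 1 by n csr_matrix. The length, seed and variable
names are assumptions picked for the example; they are not from the
original question.

```python
import numpy as np
from scipy import sparse

# Assumed example size and fill: 100,000 entries, 1% nonzero integers.
n = 100_000
rng = np.random.default_rng(0)
dense = np.zeros(n, dtype=np.int32)
idx = rng.choice(n, size=n // 100, replace=False)
dense[idx] = rng.integers(1, 100, size=idx.size)

# View the vector as a single row so that CSR is the compact format:
# its index pointer then has only 2 entries instead of n + 1.
row = sparse.csr_matrix(dense.reshape(1, -1))

# Memory actually held by the three CSR arrays.
sparse_bytes = row.data.nbytes + row.indices.nbytes + row.indptr.nbytes

# Read access: single elements, reductions, and a dense round trip.
k = int(idx[0])
value_at_k = row[0, k]
total = row.sum()
back_to_dense = row.toarray().ravel()

print(row.shape, row.nnz, sparse_bytes, dense.nbytes)
```

Single-element lookups on a csr matrix are slower than on a dense
array, so for read-often use it may be worth checking whether whole-vector
operations (sums, dot products) cover what you need.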
I occasionally wonder whether a generalized sparse ndarray object
would be useful. You'd want it to specify the value for empty elements
so that things like boolean arrays could be done this way, and I think
a dictionary-of-keys approach would be the way to go. In any case,
nothing exists now.
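
To make the idea concrete, here is a minimal pure-Python sketch of a
1-D dictionary-of-keys vector with a configurable value for empty
elements. The class name SparseVector and its methods are hypothetical,
invented for this example; nothing like it is provided by numpy or
scipy, and it only handles non-negative integer indices.

```python
class SparseVector:
    """Hypothetical 1-D dictionary-of-keys vector with a fill value."""

    def __init__(self, length, fill=0):
        self.length = length
        self.fill = fill      # value reported for every unstored index
        self._data = {}       # index -> value, only where value != fill

    @classmethod
    def from_dense(cls, values, fill=0):
        vec = cls(len(values), fill)
        for i, v in enumerate(values):
            if v != fill:
                vec._data[i] = v
        return vec

    def _check(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        self._check(i)
        return self._data.get(i, self.fill)

    def __setitem__(self, i, v):
        self._check(i)
        if v == self.fill:
            # Writing the fill value frees the slot instead of storing it.
            self._data.pop(i, None)
        else:
            self._data[i] = v

    @property
    def stored(self):
        """Number of explicitly stored (non-fill) elements."""
        return len(self._data)

    def to_dense(self):
        out = [self.fill] * self.length
        for i, v in self._data.items():
            out[i] = v
        return out


# Integer vector where empty means 0.
counts = SparseVector.from_dense([0, 0, 5, 0, 7])

# Boolean vector where empty means True, so only the False entries cost memory.
mask = SparseVector(4, fill=True)
mask[1] = False
```

As noted for dok matrices, this is no more memory-efficient than a
defaultdict, since every stored entry is still a Python object; the
fill value is the only thing it adds.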