[SciPy-User] Sparse vector
josef.pktd@gmai...
Thu Apr 15 17:29:47 CDT 2010
On Thu, Apr 15, 2010 at 6:18 PM, Anne Archibald
<peridot.faceted@gmail.com> wrote:
> On 15 April 2010 17:29, Felix Schlesinger <schlesin@cshl.edu> wrote:
>> Hello,
>>
>> I was wondering what people's recommendations and thoughts are on
>> sparse vectors, i.e. long vectors where most entries are 0.
>> 1-D numpy arrays waste a lot of memory in that case. Python
>> defaultdicts still use more memory than should be needed (since they
>> store python objects) and do not work well for numpy math operations
>> and slicing. Scipy.sparse has several implementations for sparse 2D
>> matrices which could be used for vectors, but that does not seem ideal
>> for clarity, efficiency and function broadcasting. Is there something
>> else out there or am I maybe missing a simple way to do it
>> efficiently?
>> In my particular case the vectors would be write-once, read-often and
>> maybe about 1% filled with integers. They are small enough to fit into
>> memory in dense form one at a time during construction.
>
> The short answer is, no, there's no support for such a thing in numpy/scipy.
>
> There's no way to make such a thing under-the-hood compatible with
> numpy arrays, since they require evenly-strided memory. And scipy's
> sparse matrices are built on the assumption that an n by n matrix will
> have at least O(n) nonzero elements, so you are going to have to watch
> carefully what you do with your sparse vectors. That said, dok
> matrices should be all right (though no more efficient than
> defaultdicts) and one of csr/csc matrices will be efficient, depending
> on whether you view your vectors as row or column matrices.
>
> I occasionally wonder whether a generalized sparse ndarray object
> would be useful. You'd want it to specify the value for empty elements
> so that things like boolean arrays could be done this way, and I think
> a dictionary of keys approach would be the way to go. In any case,
> nothing exists now.
For some applications, I keep just the nonzero values in one array and
their indices in a separate array, which allows easy back-and-forth
conversion to dense.
But I don't really use it as a substitute for sparse, just so that I
have a convenient representation, e.g. for optimization and for easier
input.
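
A minimal sketch of the values-plus-index representation described
above, assuming a 1-D integer vector that fits in memory in dense form
during construction. The helper names to_sparse, to_dense and lookup are
hypothetical, chosen only for illustration; they are not part of numpy
or scipy.

```python
import numpy as np

def to_sparse(dense):
    """Split a dense 1-D array into (indices, values, length)."""
    dense = np.asarray(dense)
    idx = np.flatnonzero(dense)      # sorted positions of the nonzero entries
    return idx, dense[idx], dense.shape[0]

def to_dense(idx, vals, n):
    """Rebuild the dense vector from the index/value pair."""
    out = np.zeros(n, dtype=vals.dtype)
    out[idx] = vals
    return out

def lookup(idx, vals, i):
    """Read element i without densifying (binary search on sorted idx)."""
    pos = np.searchsorted(idx, i)
    if pos < len(idx) and idx[pos] == i:
        return vals[pos]
    return vals.dtype.type(0)        # implicit zero for missing entries

# Example: a length-1000 vector with three nonzero integers.
dense = np.zeros(1000, dtype=np.int64)
dense[[3, 250, 999]] = [7, -2, 5]
idx, vals, n = to_sparse(dense)
```

Because np.flatnonzero returns the indices in ascending order, single
elements can be read with a binary search, which suits the write-once,
read-often case. Arithmetic between two such vectors would still need
extra index handling, or a round trip through to_dense.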
Josef
>
> Anne
>
>> Thanks
>> Felix
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>