[Numpy-discussion] Special matrices with structure?

Dag Sverre Seljebotn d.s.seljebotn@astro.uio...
Thu Feb 23 11:49:06 CST 2012


On 02/23/2012 09:47 AM, Dag Sverre Seljebotn wrote:
> On 02/23/2012 05:50 AM, Jaakko Luttinen wrote:
>> Hi!
>>
>> I was wondering whether it would be easy/possible/reasonable to have
>> classes for arrays that have special structure in order to use less
>> memory and speed up some computations?
>>
>> For instance:
>> - symmetric matrix could be stored in almost half the memory required by
>> a non-symmetric matrix
>> - diagonal matrix only needs to store the diagonal vector
>> - Toeplitz matrix only needs to store one or two vectors
>> - sparse matrix only needs to store non-zero elements (some
>> implementations in scipy.sparse)
>> - and so on
>>
>> If such classes were implemented, it would be nice if they worked with
>> numpy functions (dot, diag, ...) and operations (+, *, +=, ...) easily.
>>
>> I believe this has been discussed before but google didn't help a lot..
>
> I'm currently working on a library for this. The catch is that that I'm
> doing it as a work project, not a hobby project -- so only the features
> I strictly need for my PhD thesis really gets priority. That means that
> it will only really be developed for use on clusters/MPI, not so much
> for single-node LAPACK.
>
> I'd love to pair up with someone who could make sure the library is more
> generally useful, which is my real goal (if I ever get spare time
> again...).
>
> The general idea of my approach is to have lazily evaluated expressions:
>
> A = # ... diagonal matrix
> B = # ... dense matrix
>
> L = (give(A) + give(B)).cholesky() # only "symbolic"!
> # give means: overwrite if you want to
>
> explain(L) # prints what it will do if it computes L
> L = compute(L) # does the computation
>
> What the code above would do is:
>
> - First, determine that the fastest way of doing + is to take the
> elements in A and += them inplace to the diagonal in B
> - Then, do the Cholesky in

Sorry: Then, do the Cholesky inplace in the buffer of B, and use that for L.

Dag

>
> Note that if you change the types of. The goal is to facilitate writing
> general code which doesn't know the types of the matrices, yet still
> string together the optimal chain of calls. This requires waiting with
> evaluation until an explicit compute call (which essentially does a
> "compilation").
>
> Adding matrix types and operations is done through pattern matching.
> This one can provide code like this to provide optimized code for wierd
> special cases:
>
> @computation(RowMajorDense + ColMajorDense, RowMajorDense)
> def add(a, b):
> # provide an optimized case for row-major + col-major, resulting
> # in row-major
>
> @cost(add)
> def add_cost(a, b):
> # provide estimate for cost of the above routine
>
> The compiler looks at all the provided @computation and should
> determines the cheapest path.
>
> My code is at https://github.com/dagss/oomatrix, but I certainly haven't
> done anything yet to make the codebase useful to anyone but me, so you
> probably shouldn't look at it, but rather ask me here.
>
> Dag



More information about the NumPy-Discussion mailing list