# [Numpy-discussion] untenable matrix behavior in SVN

Timothy Hochberg tim.hochberg@ieee....
Wed Apr 30 23:32:01 CDT 2008

```On Wed, Apr 30, 2008 at 8:16 PM, Anne Archibald <peridot.faceted@gmail.com>
wrote:

> 2008/4/30 Charles R Harris <charlesr.harris@gmail.com>:
>
> > Some operations on stacks of small matrices are easy to get, for
> instance,
> > +,-,*,/, and matrix multiply. The last is the interesting one. If A and B
> > are stacks of matrices with the same number of dimensions with the
> matrices
> > stored in the last two indices, then
> >
> > sum(A[...,:,:,newaxis]*B[...,newaxis,:,:], axis=-2)
> >
> > is the matrix-wise multiplication of the two stacks. If B is replaced by
> a
> > stack of 1D vectors, x, it is even simpler:
> >
> > sum(A[...,:,:]*x[...,newaxis,:], axis=-1)
> >
> > This doesn't go through BLAS, but for large stacks of small matrices it
> > might be even faster than BLAS because BLAS is kinda slow for small
> > matrices.
>
> Yes and no. For the first operation, you have to create a temporary
> that is larger than either of the two input arrays. These invisible
> (potentially) gigantic temporaries are the sort of thing that puzzle
> users when as their problem size grows they suddenly find they hit a
> massive slowdown because it starts swapping to disk, and then a
> failure because the temporary can't be allocated. This is one reason
> we have dot() and tensordot() even though they can be expressed like
> this. (The other is of course that it lets us use optimized BLAS.)
>
>
> This rather misses the point of Timothy Hochberg's suggestion (as I
> understood it), though: yes, you can write the basic operations in
> numpy, in a more or less efficient fashion. But it would be very
> valuable for arrays to have some kind of metadata that let them keep
> track of which dimensions represented simple array storage and which
> represented components of a linear algebra object. Such metadata could
> make it possible to use, say, dot() as if it were a binary ufunc
> taking two matrices. That is, you could feed it two arrays of
> matrices, which it would broadcast to the same shape if necessary, and
> then it would compute the elementwise matrix product.
>
> The question I have is, what is the right mathematical model for
> describing these
> arrays-some-of-whose-dimensions-represent-linear-algebra-objects?
>
>
> One idea is for each dimension to be flagged as one of "replication",
> "vector", or "covector". A column vector might then be a rank-1 vector
> array, a row vector might be a rank-1 covector array, a linear
> operator might be a rank-2 object with one covector and one vector
> dimension, a bilinear form might be a rank-2 object with two covector
> dimensions. Dimensions designed for holding repetitions would be
> flagged as such, so that (for example) an image might be an array of
> shape (N,M,3) of types ("replication","replication","vector"); then to
> apply a color-conversion matrix one would simply use dot() (or "*" I
> suppose). without too much concern for which index was which. The
> problem is, while this formalism sounds good to me, with a background
> in differential geometry, if you only ever work in spaces with a
> canonical metric, the distinction between vector and covector may seem
>
> Implementing such a thing need not be too difficult: start with a new
> subclass of ndarray which keeps a tuple of dimension types. Come up
> with an adequate set of operations on them, and implement them in
> terms of numpy's functions, taking advantage of the extra information
> about each axis. A few operations that spring to mind:
>
> * Addition: it doesn't make sense to add vectors and covectors; raise
> an exception. Otherwise addition is always elementwise anyway. (How
> hard should addition work to match up corresponding dimensions?)
> * Multiplication: elementwise across "repetition" axes, it combines
> vector axes with corresponding covector axes to get some kind of
> generalized matrix product. (How is "corresponding" defined?)
> * Division: mostly doesn't make sense unless you have an array of
> scalars (I suppose it could compute matrix inverses?)
> * Exponentiation: very limited (though I suppose matrix powers could
> be implemented if the shapes are right)
> * Change of basis: this one is tricky because not all dimensions need
> come from the same vector space
> * Broadcasting: the rules may become a bit byzantine...
>
> Is this a useful abstraction? It seems like one might run into trouble
> when dealing with arrays whose dimensions represent vectors from
> unrelated spaces.

Thanks Anne. That is exactly what I had in mind. Alas, every time I sit down
to try to prototype some code, it collapses under its own weight. I'm
becoming warmer to an extended version of the row/col/matrix idea just
because its simpler to understand. It would be just as you describe above
but would support only four cases:

1. scalar: all axes are 'replication' axes
2. vector: last axes is a 'vector' all other replication.
3. covector: last axes is a 'covector' all other replication
4. matrix: last two axes are covector and vector respectively. Others are
replication.

Or something like that. It's basically the row/col/matrix formulation with
stacking. I suspect in practice it gives us most of the power of the full
proposal with less complexity (for the user anyway). Then again, if the
implementation turns out not to be that bad, one could always implement some
convenience types on top of it to make it usable for the masses with the
fully general version available for the stout of heart.

Regards,

--
. __
. |-\
.
. tim.hochberg@ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080430/235c5f24/attachment-0001.html
```