[SciPy-user] repmat of matrix returns array
wbaxter at gmail.com
Thu Apr 27 21:27:05 CDT 2006
Just one quick addendum that occurred to me...
On 4/28/06, Bill Baxter <wbaxter at gmail.com> wrote:
> ... Basically from memory it's this (in a hodgepodge of notations):
> Let A be rank-N with shape (s1, ..., sN)
> Let B be rank-M with shape (t1, ..., tM)
> For the tensor product A * B to exist, we must have sN==t1
> Let i1,...,iN be indices for A, and j1,...,jM be indices for B.
> C[i1,..., iN-1, j2, ..., jM] = dot( A[i1, ..., iN-1, :],
> B[: , j2, ..., jM] )
> Where the ...'s are mathematical ellipses (meaning I left the middle part
> out), not Python syntax.
> I'd have to think about it some more, but I think this fundamental
> definition from tensor algebra automatically covers all the cases enumerated
> in the above docstring, including returning shape (N,1) for (N,M)*(M,1)
> versus just (N) for (N,M)*(M). Oh, there's one more twist needed to cover
> transposes. So actually it should be allowable to do (M,N)*(M,P). It just
> means A.transpose() * B.
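That shape behavior is easy to check directly with numpy's dot (a quick sketch with small example shapes):

```python
import numpy as np

A = np.arange(12.0).reshape(3, 4)   # shape (N, M) with N=3, M=4

# (N, M) * (M, 1) -> shape (N, 1): the trailing length-1 axis survives
col = np.ones((4, 1))
print(np.dot(A, col).shape)    # (3, 1)

# (N, M) * (M,) -> shape (N,): contracting a rank-1 operand drops that axis
vec = np.ones(4)
print(np.dot(A, vec).shape)    # (3,)
```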
Actually, thinking about it more, there are plenty more variations on the
tensor product for higher rank tensors.
For instance with a.shape==(N,N), and b.shape==(N,N), the dummy indices can
appear pretty much anywhere and there can be any number of them. Some
examples from math:
1) c_ik = a_ij * b_jk --- normal matrix multiply a * b
2) c_ik = a_ji * b_jk --- transpose multiply a^t * b
3) c = a_ij * b_ij --- element-wise product and sum of all
elements (aka trace of a^t * b)
4) c_ijkl = a_ij * b_kl --- the outer (tensor) product; it creates a
rank-4 tensor from two rank 2's.
5) c_kilj = a_ij * b_kl --- same as above, just the axes in the result
are swapped around
So what I'm thinking is that the first behavior should be the default for the
'linalg.mult()' function. That is, assume there's one dummy index, that
it's on the last axis of a and the first axis of b, and that the result
comes from tacking the remaining non-dummy axes of a and b together in order
(so no transposes on the result).
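For what it's worth, numpy's tensordot already provides exactly this default contraction (a sketch; axes=1 sums the last axis of a against the first axis of b):

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((4, 5))

# axes=1: sum over the last axis of a and the first axis of b;
# the remaining non-dummy axes of a and b are concatenated, in order.
c = np.tensordot(a, b, axes=1)
print(c.shape)    # (2, 3, 5)
```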
However it would be nice to be able to use the full generality of tensor
index notation to specify tensor products when wanted. I can think of a
couple of ways you might want to specify the dummies.
First, as a pair of axes (or list of pairs). So mult(a, b, (n1,n2)) would
mean a.shape[n1] and b.shape[n2] share a dummy index that is summed over.
Trying it out on the above examples would look like this:
1) mult(a, b, (1,0))
or equivalently mult(a, b, (-1,0)), meaning the last axis of a and the
first axis of b. This could be the default value of the third argument.
2) mult(a, b, (0,0))
3) mult(a, b, [(0,0), (1,1)])
4) mult(a, b, [])   (an empty list: no dummy indices)
5) (not possible with this notation, but could be done as mult(a, b, [])
followed by some swapaxes calls.)
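This pair-of-axes interface is essentially what numpy's tensordot offers (mult above is hypothetical; tensordot is real). A sketch of the five examples in that form:

```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)
b = np.arange(9.0).reshape(3, 3) + 1.0

# 1) c_ik = a_ij b_jk: contract axis 1 of a with axis 0 of b
c1 = np.tensordot(a, b, axes=[(1,), (0,)])
# 2) c_ik = a_ji b_jk: contract axis 0 of a with axis 0 of b
c2 = np.tensordot(a, b, axes=[(0,), (0,)])
# 3) c = a_ij b_ij: contract both axis pairs -> a scalar
c3 = np.tensordot(a, b, axes=[(0, 1), (0, 1)])
# 4) c_ijkl = a_ij b_kl: no contracted axes -> rank-4 outer product
c4 = np.tensordot(a, b, axes=0)
# 5) c_kilj: same as 4) followed by reordering the axes
c5 = c4.transpose(2, 0, 3, 1)

assert np.allclose(c1, a.dot(b))
assert np.allclose(c2, a.T.dot(b))
assert np.isclose(c3, np.trace(a.T.dot(b)))
assert c4.shape == (3, 3, 3, 3)
```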
The other option is to allow something like the mathematical notation to be
used. This would require a little parsing, but it would definitely be
handy, I think.
1) mult(a,b, "ij*jk")
2) mult(a,b, "ji*jk")
3) mult(a,b, "ij*ij")
4) mult(a,b, "ij*kl")
5) mult(a,b, "kilj=ij*kl")
The first option would be easier to use if you just want something like "use
dummies for the last and second-to-last dimensions". Then you could just
say [(-1,-1),(-2,-2)] and it would work for tensors of any rank >= 2.
The latter option would be easier if you're working with known, fixed,
smallish-rank entities. Plus it would be much easier to use if you're just
transcribing some tensor math from your notebook into numpy.
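As it happens, numpy later grew np.einsum, which implements essentially this string notation: commas instead of '*', and an optional '->' naming the output indices, which handles case 5 directly. A sketch of all five examples:

```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)
b = np.arange(9.0).reshape(3, 3) * 2.0

c1 = np.einsum('ij,jk->ik', a, b)    # 1) ordinary matrix multiply
c2 = np.einsum('ji,jk->ik', a, b)    # 2) a.T times b
c3 = np.einsum('ij,ij->', a, b)      # 3) scalar: trace of a.T times b
c4 = np.einsum('ij,kl->ijkl', a, b)  # 4) rank-4 outer product
c5 = np.einsum('ij,kl->kilj', a, b)  # 5) outer product with axes permuted

assert np.allclose(c1, a.dot(b))
assert np.allclose(c2, a.T.dot(b))
assert np.allclose(c5, c4.transpose(2, 0, 3, 1))
```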