[SciPy-user] numarray interface and performance issues (for dot product and transpose)
Fernando Perez
fperez at pizero.colorado.edu
Fri Mar 1 01:48:58 CST 2002
A few (spotty) comments on your excellent and detailed report.
> scipy and other projects are bustling with activity and, as far as I
> know, there is currently a promising undergraduate project on the way
> at my university to provide powerful plotting facilities for python).
I hope that if this undergraduate project you mention is something serious,
they could consider joining forces with the scipy plotting effort. One of the
problems with open source projects is that it's all too common to see fifteen
half-finished systems with not a single one that really works. In my opinion
the scipy leaders have done a great job of not reinventing any wheels
unnecessarily, trying to leverage all existing code (when feasible). But now
plotting is an area of scipy which isn't recieving excessive attention
(because people are super-busy with other things), so I'm sure that outside
hands would be appreciated. At least it would be worth discussing it.
The hope (at least I have) is that in the future, SciPy will be a one-stop
solution for a solid, comprehensive scientific computing environment with
Python. But for that to become a reality various projects should if at all
possible consider joining forces.
> I am not completely sure what the policy and rationale behind the
> current division of labor between functions, methods and attributes in
> numarray is but as far as the lack of a missing transposition postfix
> operator is concerned, one reasonable approach to make the transpose
> operation more readable (without the need to change anything in python
> itself) would seem to me to provide arrays with an attribute .t or .T
> so that:
>
> a.t == transpose(a)
>
> While this would admittedly be a bit of a syntactic hack, the
> operation is so commonplace and the gain in readability (in my eyes)
> is so significant that it would seem to me to merit serious
> consideration (maybe a.cc or so might also be an idea for complex
> conjugate notation, but I'm less convinced about that?).
Well, as others told you using a Matrix instead of an array will get you this
notation. Here's an example:
In [24]: a=Matrix.Matrix([[0,1+1j],[1,3-1j]])
In [25]: a
Out[25]= Matrix([[ 0.+0.j, 1.+1.j],
[ 1.+0.j, 3.-1.j]])
In [26]: a.T # transpose
Out[26]= Matrix([[ 0.+0.j, 1.+0.j],
[ 1.+1.j, 3.-1.j]])
In [27]: a.H # Hermitian conjugate
Out[27]= Matrix([[ 0.-0.j, 1.-0.j],
[ 1.-1.j, 3.+1.j]])
In [28]: a.H.T # complex conjugate, an a.cc notation would be nice
Out[28]= Matrix([[ 0.-0.j, 1.-1.j],
[ 1.-0.j, 3.+1.j]])
Now, I don't know internally how these operations are implemented, they
probably return copies of their objects (slow, expensive for big
matrices). But I was wondering if it would be worth considering making .T and
.cc simple flags. Basically, one could define them via getters (properties in
python 2.2) so that all the above syntax works, but internally the getter for
a.T would return a with a flag for transposition on. Then any matrix operation
could check for whether this flag was on or not and arrange its looping order
to honor the convention without changing ever the data in a. The same thing
could be done for complex conjugation.
This may turn out to be a really stupid idea, but at least I'd like to hear
other's opinions. This would require all the basic operations on matrices
(__getitem__, __add__, __mult__, etc) to check for these flags. But once that
job is done (once), operations involving complex conjugation or transposition
could be done very fast, only keeping copies of the original data (since both
a.T and a.cc can be expressed in terms of a). For the .cc case this would
require writing the basic complex algebra in terms of conjugated numbers, but
that's a minor annoyance.
The tradeoff seems to be one of complexity in the base operations code and a
bit of extra checking (an if test at the start). But for large arrays this may
give significant benefits.
Communication problems: if one needs to pass a.T to an external library which
has no idea of how to handle these flags, there's a problem. If this idea
doesn't have other fundamental flaws (which it probably does), one way around
this crucial problem would be to have _two_ different ways of getting
transpose(a) --similar ideas for conjugation:
- a.T -> 'flagged' object, only works with libraries which know how to handle
it but allows writing clean, fast code with zero-copy.
- a.Transpose() -> makes a copy and returns a normal array with the elements
transposed. Any routine which can handle a numpy array can handle this.
So, I realize this idea may well be fundamentally broken, but it may also have
something of interest in it. Feel free to shoot me down.
Regards,
f.
More information about the SciPy-user
mailing list