The NumPy Fortran-ordering quiz
Tim Hochberg
tim.hochberg at ieee.org
Wed Oct 18 13:27:38 CDT 2006
Charles R Harris wrote:
>
>
> On 10/18/06, *Tim Hochberg* <tim.hochberg at ieee.org
> <mailto:tim.hochberg at ieee.org>> wrote:
>
> Charles R Harris wrote:
>
> [SNIP]
> >
> > I'm not talking about the keyword in the ravel call, I'm talking
> > about the flag in a. The question is: do we *need* a Fortran flag?
> > I am arguing not, because the only need is for Fortran-contiguous
> > arrays to pass to Fortran functions, or translation from
> > Fortran-contiguous arrays to numpy arrays. What I am saying is that
> > things are unnecessarily complicated. None of the LAPACK stuff
> > seems to use the Fortran stuff; they just transpose and copy. I
> > don't even think I want to change that, because it is *clear* what
> > is going on. Interfacing to Fortran is all about memory layout,
> > nothing more or less.
> >
> >
>
> Chuck,
>
> There are two things here. One is the order keyword and one is the
> FORTRAN flag. The latter is mainly an optimization for use at the
> C-level so that one doesn't have to check whether a given array is in
> contiguous FORTRAN order by examining the strides, in the same way
> that
> the CONTIGUOUS flag allows you to skip examining the strides when you
> need a contiguous C-order matrix.
>
>
> That sounds like the two flags should be named f-contiguous and
> c-contiguous. Then they would be orthogonal and one could have all
> four combinations. Is that the case now? Perhaps I am misunderstanding
> the meaning of the flags.
That is the case now. The flag names simply mirror their values in C.
Why they have those names is something of a historical accident, I
believe. Take a look at this:
>>> array([1,2,3,4]).flags
CONTIGUOUS : True
FORTRAN : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> array([1,2,3,4])[::2].flags
CONTIGUOUS : False
FORTRAN : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> array([[1,2],[3,4]]).flags
CONTIGUOUS : True
FORTRAN : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> array([[1,2],[3,4]], order='F').flags
CONTIGUOUS : False
FORTRAN : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
See, all four combinations. I guess my previous post was wrong -- there
really are four combinations, not three as I said: C-order,
Fortran-order, both, and neither. I forgot what the flags signified and
had to play with them for a bit to remember.
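In current NumPy the same four combinations can be checked
programmatically; a minimal sketch (using the modern flag spellings
`C_CONTIGUOUS`/`F_CONTIGUOUS`, which correspond to the `CONTIGUOUS` and
`FORTRAN` flags shown above):

```python
import numpy as np

a = np.array([1, 2, 3, 4])       # 1-D: both C- and F-contiguous
b = np.array([[1, 2], [3, 4]])   # 2-D C-order: C-contiguous only
f = np.asfortranarray(b)         # 2-D F-order: F-contiguous only
s = np.arange(4)[::2]            # strided view: neither

assert a.flags['C_CONTIGUOUS'] and a.flags['F_CONTIGUOUS']
assert b.flags['C_CONTIGUOUS'] and not b.flags['F_CONTIGUOUS']
assert f.flags['F_CONTIGUOUS'] and not f.flags['C_CONTIGUOUS']
assert not s.flags['C_CONTIGUOUS'] and not s.flags['F_CONTIGUOUS']
```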
> I believe it is the former that you
> are objecting to, but it would help if you could specify whether
> you are
> talking about the order keyword or whether you are talking about the
> FORTRAN flag.
>
>
> Both. I was arguing against the FORTRAN flag, and for limiting the
> order keyword to those cases where F- or C-contiguous arrays were the
> output or input.
>
> I'll also note that the order keyword could probably have been used to
> fix a performance problem someone was having a few weeks ago. We ended
> up transposing the data, but the individual felt that obscured the
> intent of the algorithm. I believe the same effect could probably have
> been achieved without re-jiggering the algorithm by using the order
> parameter.
>
>
> Some more details would be helpful. It would be good to know what
> problem the order keyword should solve.
>
Well, in general, memory layout can be important for performance, not
just for interfacing with Fortran. You can get the same effect with
suitable applications of transpose, but using the order flag is probably
clearer. In particular, if you are trying to match a textbook algorithm,
it's nice to have the axes in the same places. I'm just moving over from
numarray, which didn't have an equivalent of the order flag as far as I
know, so I don't yet have much experience with it.
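To make the layout point concrete: a column of a Fortran-ordered 2-D
array is one contiguous block of memory, while the same column of a
C-ordered array is strided. A small sketch (shapes chosen arbitrarily
for illustration):

```python
import numpy as np

n, K = 8, 3
yc = np.zeros((n, K))             # C order (default)
yf = np.zeros((n, K), order='F')  # Fortran order

# In the F-order array, elements of a column sit at consecutive
# addresses (stride == itemsize); in the C-order array, each step
# down a column skips over K elements.
col_c = yc[:, 0]
col_f = yf[:, 0]

assert col_f.strides == (8,)            # unit stride: contiguous
assert col_c.strides == (24,)           # K * itemsize: strided
assert col_f.flags['C_CONTIGUOUS']
assert not col_c.flags['C_CONTIGUOUS']
```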
Here is one of the posts in question:
> David Cournapeau wrote:
> Hi,
>
> I was wondering if there was any way to speed up the following code:
>
> y = N.zeros((n, K))
> for i in range(K):
>     y[:, i] = gauss_den(data, mu[i, :], va[i, :])
>
> Where K is of order 1e1, n of order 1e5. Normally, gauss_den is a
> quite expensive function, but the profiler tells me that the indexing
> y[:,i] takes almost as much time as the gauss_den computation (which
> computes n exponentials!). To see if the profiler is right, I replaced
> it with the following (non-valid) function:
>
> y = N.zeros((n, K))
> for i in range(K):
>     yt = gauss_den(data, mu[i, :], va[i, :])
> return y
>
> Where more than 99% of the code is spent inside gauss_den.
>
> I guess the problem comes from the fact that, y being C order, y[:,
> i] needs to access data in a non-'linear' way. Is there a way to
> speed this up? I did something like this:
>
> y = N.zeros((K, n))
> for i in range(K):
>     y[i] = gauss_den(data, mu[i, :], va[i, :])
> return y.T
>
> which works, but I don't like it very much.
I believe that the same efficiency as the last could have been achieved
using something like:
y = N.zeros((n, K), order='F')
for i in range(K):
    y[:, i] = gauss_den(data, mu[i, :], va[i, :])
return y
This probably would have made the original poster happier.
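As a self-contained sketch of that pattern: the original gauss_den is
not shown in the thread, so the version below is a hypothetical
stand-in that just returns one value per row of `data`; the sizes are
also made up. The point is only that with order='F' each column
assignment writes one contiguous block:

```python
import numpy as np

def gauss_den(data, mu, va):
    # Hypothetical stand-in for the poster's density function:
    # returns one (unnormalized) value per row of `data`.
    return np.exp(-((data - mu) ** 2 / va).sum(axis=1))

n, K, d = 1000, 10, 2
rng = np.random.default_rng(0)
data = rng.random((n, d))
mu = rng.random((K, d))
va = rng.random((K, d)) + 0.5

# Fortran order makes each y[:, i] a contiguous block, so the
# column assignment below is a straight memory copy.
y = np.zeros((n, K), order='F')
for i in range(K):
    y[:, i] = gauss_den(data, mu[i, :], va[i, :])

assert y.flags['F_CONTIGUOUS']
```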
-tim