[SciPy-dev] Re: [SciPy-user] Low-level code integration discussion at scipy'03?
eric jones
eric at enthought.com
Wed Sep 3 13:16:11 CDT 2003
Hey Guys,
I'm interested in the conversation also. We can set up a BOF session on
Thursday night for this. Fernando, would you be willing to chair it?
There will be a lot of others available to chime in I'm sure. I don't
have an "official" time yet, but my guess would be that it will start
around 8:00 to 8:30 in the evening and run until people are tired of
talking.
Michael:
Will it be any problem to get a couple of meeting rooms that evening?
thanks,
eric
Fernando Perez wrote:
> Prabhu Ramachandran wrote:
>
>> 1. I guess that weave code will usually be used to optimize the
>> slowest loops. Typically eliminating 'for/while' loops. The
>> trivial (and ancient) Laplacian examples on the SciPy pages
>> demonstrate one such case. My guess is most other uses would be
>> similar. In such cases merely having slices working is not good
>> enough. You might also need to do something non-trivial to the
>> array. At that point I'm not sure how much the niceties of blitz
>> would be used (efficiently)? Could you provide a more
>> illustrative example of what blitz does that could not be done by
>> a simple C++ loop or something?
>
>
> It's not a matter of being impossible in loops, just that the blitz
> syntax is incredibly compact. Consider computing Numeric's
> innerproduct() operation for a rank-2 array M and a rank-6 one T,
> storing the result in U. In blitz, the code reads:
>
> // 6d version
> void mat_ten_inner(Array<double,2>& M,
> Array<double,6>& T, Array<double,6>& U ) {
> firstIndex i1;
> secondIndex i2;
> thirdIndex i3;
> fourthIndex i4;
> fifthIndex i5;
> sixthIndex i6;
> seventhIndex j;
>
> U = sum(M(i1,j)*T(i2,i3,i4,i5,i6,j),j);
> }
>
> That's it. ONE line of actual code! A version of this done with hand
> loops requires obviously 7 nested for loops. Since I needed similar
> code for dimensions 1..6, I wrote preprocessor macros to generate all
> versions from a 'template' (meaning, another macro, not a C++
> template<>). This is, of course, ugly.
>
> Having said that, the above one-liner performs about 3 times slower
> than the loop macro version for d=6. For d=1 they are about even, the
> blitz one-liner gets worse as d increases. I posted to the blitz list
> asking about this, but no response so far.
>
> So, in the end, I'm writing my own loops too :) This is just to point
> that I'm ok with writing loops too, even though I'd love blitz to
> handle things like the above better.
>
> What blitz DOES buy you is A(i,j,k,l...) type indexing of high-rank
> arrays, which is not trivial with the C Numeric API. And that is
> important. Writing loops isn't bad, but all the manual pointer
> arithmetic currently necessary with the existing APIs is rather annoying.
>
> It's a matter of balancing abstraction with performance. Blitz
> provides a lot of abstraction, but performance isn't very good. The
> subset of blitz which gives just indexing (you do the rest yourself)
> seems to maintain pretty good performance in many cases (though I need
> to do more precise benchmarks against fortran).
>
>> 3. If plain old inlined C code does indeed perform better than the
>> blitz equivalents, it sure would be useful to have the cleaner API
>> anyway. Afterall performance is a key issue here. Having a
>> cleaner API (without fancy expression templates but without having
>> to convert between C++ and Python data) would really be useful in
>> any case.
>
>
> This, we agree 100% on. In fact, if the C API offered a way to index
> high rank arrays without manually computing stride offsets on every
> access, that would probably satisfy a significant fraction of usage
> cases. I consider one-liners like the above icing on the cake, but
> A(i,j,k) indexing is really a significant change which makes
> development far more productive.
>
>> Well, I'm sure I would not mind writing this kind of loop (taken from
>> here http://www.scipy.org/site_content/weave/python_performance.html):
>>
>> double tmp, err, diff;
>> err = 0.0;
>> for (int i=1; i<nx-1; ++i) {
>> for (int j=1; j<ny-1; ++j) {
>> tmp = u(i,j);
>> u(i,j) = ((u(i-1,j) + u(i+1,j))*dy2 +
>> (u(i,j-1) + u(i,j+1))*dx2)*dnr_inv;
>> diff = u(i,j) - tmp;
>> err += diff*diff;
>> }
>> }
>
>
> I don't either. As I said, the main consideration for me is clean
> indexing of rank2+ arrays. Everything else I take as extra credit,
> and I can live without. For example, blitz has 'stencils' to do the
> above, and in 3d cases they do simplify writing the code a lot. But I
> haven't tested their performance, and after seeing my innerproduct()
> example above, I'd be at least a bit skeptical. So loops are fine
> with me.
>
> Best,
>
> f.
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.net
> http://www.scipy.net/mailman/listinfo/scipy-dev
More information about the Scipy-dev
mailing list