[Numpy-discussion] Anyone have a well-tested SWIG-based C++ STL valarray <=> numpy.array typemap to share?

Charles R Harris charlesr.harris@gmail....
Tue Sep 18 09:24:42 CDT 2007


On 9/17/07, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
>
> Christopher Barker wrote:
> > David Cournapeau wrote:
> >> Christopher Barker wrote:
> >>> My real question is what compiler and library writers are doing -- has
> >>> anyone (OK, I guess MS and gcc are all I care about anyway) built
> >>> anything optimized for them? Are they going to dump them? Who knows?
> >> What do you mean by optimization ?
> >
> > Well, I'm quite specifically not being precise about that. It appears
> > the POINT of valarray was to provide a way to do computation that
> > compiler(library) writers could optimize in various ways for the system
> > at hand. The one example I have seen is someone that wrote a version
> > that takes advantage of the PPC altivec instructions:
> >
> > (http://www.pixelglow.com/stories/altivec-valarray-2/)
> >
> > Anyway, at this point I'm far less concerned about optimization than
> > just a more robust and convenient way to deal with data than raw
> > pointers.
> >
> >> I remember having used blitz at some point, and I thought it was
> >> terrible.
> >
> > Darn -- it looks so promising.
>
> I realize that I sounded more convinced than I really am. First, to make
> my perspective more obvious, let me say that I generally hate templates.
> I think the syntax is terrible, and makes the code totally unreadable for
> everything but simple cases (simple containers, for example); I think it
> is a wrong solution for a broken language. So I prefer to avoid them if
> I can.
>
> My understanding of blitz is that it is supposed to be faster mainly
> because it can avoid temporaries thanks to expression templates. So if
> you don't need this feature, you don't gain much. But when you think
> about it, avoiding temporaries is done by symbolic computation at the
> compiler level through templates; the idea is to make expressions such as
> A = B * C + D * E^-1 * F, where everything is a matrix, as efficient as
> possible. C/C++ makes it hard because it needs to use binary operations
> with a returned value. So in the end, this is really a parsing problem;
> if so, why not use a language which can do symbolic computation, and
> convert the result into a compiled language? By using expression templates,
> you use a totally broken syntax to do things which are much more easily
> done by a language that is easy to parse (say LISP).


Templates are a godsend for containers and such using multiple types. I
think that much of Numpy could be naturally written up that way. Template
programming, on the other hand, seems to me an attempt to use the template
mechanism as a compiler. So you are probably right that a different language
would handle that more easily.
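
For readers who have not met the technique, the following is a minimal sketch of expression templates, not Blitz++ code. It assumes element-wise operations on 1-D vectors only (the matrix product and inverse in the quoted expression are not covered), and the names Expr, Vec, Sum, Prod and fused are hypothetical, chosen for illustration. The point it shows is that b * c + d * e builds a small type at compile time and is evaluated in a single loop on assignment, with no intermediate arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: lets the operators below match only expression types.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy node for an element-wise sum; nothing is computed until operator[].
template <typename L, typename R>
struct Sum : Expr<Sum<L, R> > {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Lazy node for an element-wise product.
template <typename L, typename R>
struct Prod : Expr<Prod<L, R> > {
    const L& lhs;
    const R& rhs;
    Prod(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] * rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Concrete vector; storage is owned by std::vector.
struct Vec : Expr<Vec> {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression runs ONE loop over the whole expression tree.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& x = e.self();
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = x[i];
        return *this;
    }
};

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <typename L, typename R>
Prod<L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Prod<L, R>(l.self(), r.self());
}

// Usage sketch: the right-hand side has type Sum<Prod<Vec,Vec>, Prod<Vec,Vec> >.
// The nodes hold references to temporaries, so the expression must be consumed
// within the same statement (do not store it in a variable).
inline Vec fused(const Vec& b, const Vec& c, const Vec& d, const Vec& e) {
    Vec a(b.size());
    a = b * c + d * e;
    return a;
}
```

Even in this toy form the amount of machinery needed for two operators is considerable, which is the readability cost the quoted paragraph complains about.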

> When you take a look at
> http://ubiety.uwaterloo.ca/~tveldhui/papers/DrDobbs2/drdobbs2.html, you
> also realize that the tests are done on architectures/compilers which
> are different from the ones available now.
>
> The only way to really know is to do your own tests: have a reasonable
> example of the kind of operations you intend to do, benchmark it, and
> see the differences. My experience says it is definitely not worth it
> for my problems. Maybe yours will be different.
>
> >
> >> I think C++ is much more useful
> >> for the automatic memory management through RAII, which is what
> >> std::vector gives you.
> >
> > and std::valarray not? I guess where I'm at now is deciding if there is
> > any advantage or disadvantage to using std::valarray vs. std::vector.
> > The other option is to go with something else: boost::multiarray,
> > blitz++, etc. However, at least in terms of how well they might play
> > with numpy arrays, I don't see a reason to do so.
> Valarray and vector give you more or less the same here concerning RAII.
> But vector really is more common, so I would rather pick vector
> instead of valarray unless there is a good reason not to do so, not the
> contrary.

That is my general feeling too; valarrays are the red-headed stepchildren of
the STL.

> I don't know much about boost::multiarray (I tried ublas a bit, and
> found the performance quite bad compared to C, using gcc; again, this
> was a few years ago, it may have changed since). I almost never used
> more than rank 2 arrays, so I don't know much about multi_array.
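
On the RAII point in the exchange above, a minimal sketch may help show why the two containers are roughly equivalent in this respect. Both own their storage and free it in their destructor, and both store elements contiguously, so a raw pointer can be handed to C-style code. The function names here (sum_raw, sum_with_vector, sum_with_valarray) are hypothetical; sum_raw merely stands in for a routine that expects a pointer and a length, such as one exposed through a wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>
#include <vector>

// Hypothetical C-style routine that only understands a pointer and a length.
inline double sum_raw(const double* p, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

inline double sum_with_vector(std::size_t n) {
    if (n == 0) return 0.0;              // &v[0] is only valid when non-empty
    std::vector<double> v(n, 1.5);       // (count, value)
    return sum_raw(&v[0], v.size());
}   // v's destructor releases the storage here, also on an exception

inline double sum_with_valarray(std::size_t n) {
    if (n == 0) return 0.0;
    std::valarray<double> v(1.5, n);     // note the reversed order: (value, count)
    return sum_raw(&v[0], v.size());
}   // same automatic cleanup as with std::vector
```

The reversed constructor argument order is one of the small surprises of valarray; otherwise the ownership story is the same for both containers.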


I found the performance of ublas to be pretty good for small arrays when it
was compiled with the -DNDEBUG option, and the assembly code looked pretty
good too. The default, with all the bounds checking and such, is terrible and
the assembly code practically unreadable (180-character function identifiers,
etc.), but for debugging it did its job. The main virtue of ublas is
compactness and readability of code expression, which is far better than
writing out endless numbers of loops. The automatic handling of pointers for
the default allocation type is also convenient and makes it reasonable to
have functions return matrices and vectors. I still think FORTRAN might be
a better choice than C++ for this sort of problem; it is just that C++ has
become the default for (too) many things.
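
To illustrate the debug versus release difference, here is a generic sketch of bounds checking that disappears when the standard NDEBUG macro is defined. It is not uBLAS's own implementation; CheckedVec and dot are hypothetical names, and the only mechanism assumed is the standard behaviour of assert().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for illustration only.
class CheckedVec {
public:
    explicit CheckedVec(std::size_t n) : data_(n, 0.0) {}

    // With assertions enabled (the default), each access pays for a comparison.
    // Compiling with -DNDEBUG makes assert() expand to nothing, leaving a
    // plain indexed load or store.
    double& operator()(std::size_t i) {
        assert(i < data_.size());
        return data_[i];
    }
    double operator()(std::size_t i) const {
        assert(i < data_.size());
        return data_[i];
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;  // storage is released automatically
};

// Dot product written against the checked interface.
inline double dot(const CheckedVec& a, const CheckedVec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a(i) * b(i);
    return s;
}
```

Building the same source once with and once without -DNDEBUG and comparing the generated assembly for dot is a quick way to see how much the checks cost on a given compiler.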

Chuck