[Numpy-discussion] fast putmask implementation
Eric Firing
Thu Aug 16 14:20:41 CDT 2007
In looking at maskedarray performance, I found that the filled()
function or method is a bottleneck. I think it can be sped up by using
putmask instead of indexed assignment, but I found that putmask itself
is slower than it needs to be. So I followed David Cournapeau's example
of fastclip and made a similar fastputmask. The diff relative to
current svn (3967) is attached.
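
For readers unfamiliar with the two approaches, here is a minimal sketch of filling masked positions with indexed assignment versus putmask. The helper names fill_indexed and fill_putmask are hypothetical, chosen for illustration; this is not the actual maskedarray filled() code, only the public NumPy calls it could be built on.

```python
import numpy as np

def fill_indexed(data, mask, fill_value):
    # Hypothetical helper: boolean indexed assignment on a copy.
    out = data.copy()
    out[mask] = fill_value
    return out

def fill_putmask(data, mask, fill_value):
    # Hypothetical helper: same result via np.putmask, which writes
    # fill_value into out wherever mask is True.
    out = data.copy()
    np.putmask(out, mask, fill_value)
    return out

data = np.arange(10)
mask = data % 3 == 0          # True at positions 0, 3, 6, 9
a = fill_indexed(data, mask, -1)
b = fill_putmask(data, mask, -1)
print(a)
print(b)
```

Both helpers produce the same array for a scalar fill value, so the choice between them is purely a matter of speed.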
The faster version improves putmask speed by roughly a factor of ten
or more. numpy.test() still passes.
With 10000-element integer arrays and a scalar value argument, the new
version reduces the times from 136 to 15 usec for 1000 masked elements,
and from 445 to 18 usec for 5000 masked elements. It is only slightly
slower with an array value argument. (Times are for an Intel Core2,
2 GHz, Linux.)
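
A timing run of this kind could be sketched roughly as follows. This is not the script used for the numbers above: the random placement of masked elements, the fill value of -1, the helper names make_mask and time_putmask, and the repeat counts are all assumptions made for illustration, and absolute times will depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

N = 10000                      # array length quoted in the post
rng = np.random.default_rng(0)

def make_mask(n_masked):
    # Assumption: masked positions are scattered at random.
    mask = np.zeros(N, dtype=bool)
    mask[rng.choice(N, size=n_masked, replace=False)] = True
    return mask

def time_putmask(n_masked, values, repeat=5, number=200):
    # Return the best per-call time of np.putmask in microseconds.
    a = np.arange(N)
    mask = make_mask(n_masked)
    best = min(timeit.repeat(lambda: np.putmask(a, mask, values),
                             repeat=repeat, number=number))
    return best / number * 1e6

timings = {}
for n_masked in (1000, 5000):
    timings[(n_masked, "scalar")] = time_putmask(n_masked, -1)
    timings[(n_masked, "array")] = time_putmask(n_masked, np.full(N, -1))

for key, usec in timings.items():
    print(key, f"{usec:.1f} usec")
```

Taking the minimum over several repeats reduces the influence of other activity on the machine.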
I hope someone will take a look and either tell me what I need to fix or
commit it as-is.
Thanks.
Eric
