vectorized linalg
David L Goldsmith
David.L.Goldsmith at noaa.gov
Mon Oct 9 12:09:39 CDT 2006
Tim Hochberg wrote:
> David Goldsmith wrote:
>
>> Tim Hochberg wrote:
>>
>>
>>> I periodically need to perform various linalg operations on a large
>>> number of small matrices. The normal numpy approach of stacking up the
>>> data into an extra dimension and doing the operations in parallel
>>> doesn't work because the linalg functions are only designed to work on
>>> one matrix at a time. At first glance, this restriction seems harmless
>>> enough since the underlying LAPACK functions are also designed to work
>>> one matrix at a time and it seem that there would be little to be gained
>>> by parallelizing the Python level code. However, this turns out to be
>>> the case.
>>>
>>> I have some code, originally written for numarray, but I recently ported
>>> it over to numpy, that rewrites the linalg function determinant, inverse
>>> and solve_linear_equations[1] to operate of 2 or 3D arrays, treating 3D
>>> arrays as stacks of 2D arrays. For operating on large numbers of small
>>> arrays (for example 1000 2x2 arrays), this approach is over an order of
>>> magnitude faster than the obvious map(inverse, threeDArray) approach.
>>>
>>> So, some questions:
>>>
>>> 1. Is there any interest in this code for numpy.linalg? I'd be
>>> willing to clean it up and work on the other functions in linalg so that
>>> they were all consistent. Since these changes should all be backwards
>>> compatible (unless your counting on catching an error from one of the
>>> linalg functions), I don't see this as a problem to put in after 1.0, so
>>> there's really no rush on this. I'm only bringing it up now since I just
>>> ported it over to numpy and it's fresh in my mind.
>>>
>>>
>>>
>> Say "no" to someone offering to make a Python module more feature rich?
>> Blasphemy! :-)
>>
>> But I am very curious as to the source/basis of the performance
>> improvement - did you figure this out?
>>
>>
> Yes and no. Here's what the begining of linalg.det looks lik:
>
> a = asarray(a)
> _assertRank2(a)
> _assertSquareness(a)
> t, result_t = _commonType(a)
> a = _fastCopyAndTranspose(t, a)
> n = a.shape[0]
> if isComplexType(t):
> lapack_routine = lapack_lite.zgetrf
> else:
> lapack_routine = lapack_lite.dgetrf
> # ...
>
> There's quite a few function calls here, since the actual call to dgetrf
> probably takes neglible time for a 2x2 array, the cost of computation
> here is dominated by function call overhead, creating the return arrays
> and such not. Which is the biggest culprit, I'm not sure; by vectorizing
> it, much of that overhead is only incurred once rather than hundreds or
> thousands of times, so it all becomes negligible at that point. I didn't
> bother to track down the worst offender.
>
>
Ah, that all makes sense (including not bothering to profile the code).
DG
> -tim
>
>
>
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
--
HMRD/ORR/NOS/NOAA <http://response.restoration.noaa.gov/emergencyresponse/>
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
More information about the Numpy-discussion
mailing list