[Numpy-discussion] std(axis=1) memory footprint issues + moving avg / stddev
Travis Oliphant
oliphant.travis at ieee.org
Sun Aug 27 01:49:55 CDT 2006
Torgil Svensson wrote:
> Hi
>
> ndarray.std(axis=1) seems to have memory issues on large 2D-arrays. I
> first thought I had a performance issue but discovered that std() used
> lots of memory and therefore caused lots of swapping.
>
There are certainly lots of intermediate arrays created as the
calculation proceeds. The calculation is not particularly "smart." It
just does the basic averaging and multiplication needed.
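
For illustration only, here is a rough Python sketch of that kind of
step-by-step calculation. It is not the actual NumPy source; the name
naive_std is made up for this example, and the point is just to show where
full-size intermediate arrays appear.

```python
import numpy as np

def naive_std(a, axis=1):
    """Population standard deviation computed step by step (hypothetical helper)."""
    # Converting integer input to float allocates a full-size copy.
    a = np.asarray(a, dtype=float)
    # One mean per row: small compared with the input.
    mean = a.mean(axis=axis, keepdims=True)
    # Full-size temporary: deviations from the mean.
    centered = a - mean
    # Another full-size temporary: squared deviations.
    squared = centered * centered
    # Averaging collapses the axis again, so this result is small.
    variance = squared.mean(axis=axis)
    return np.sqrt(variance)

a = np.arange(12).reshape(3, 4)
result = naive_std(a)
```

Several of those full-size arrays can be alive at the same time, which is
where the memory goes.
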
> I want to get an array where element i is the standard deviation of row
> i in the 2D array. Using valgrind on the std() function...
>
> $ valgrind --tool=massif python -c "from numpy import *;
> a=reshape(arange(100000*100),(100000,100)).std(axis=1)"
>
> ... showed me a peak of 200 MB of memory while iterating line by line...
>
>
The C code is basically a direct "translation" of the original Python
code. There are lots of temporaries created (apparently 5 at one point
:-). I did this before I had the _internal.py code in place, which is
where I put Python functions that need to be accessed from C. If I had
to do it over again, I would place the std implementation there, where
it could be appropriately optimized.
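
In the meantime, one possible workaround is to compute the result a block
of rows at a time, so the temporaries are only as large as one block. This
is a minimal sketch, not anything that exists in NumPy: the function name
std_rows_chunked and the chunk_rows parameter are assumptions made up for
this example.

```python
import numpy as np

def std_rows_chunked(a, chunk_rows=1024):
    """Row-wise std of a 2D array, processed in blocks of rows (hypothetical helper)."""
    a = np.asarray(a)
    out = np.empty(a.shape[0], dtype=float)
    for start in range(0, a.shape[0], chunk_rows):
        stop = start + chunk_rows
        # Slicing past the end is safe; the last block is simply shorter.
        # Temporaries inside std() are now at most chunk_rows rows tall.
        out[start:stop] = a[start:stop].std(axis=1)
    return out

a = np.arange(1000 * 10).reshape(1000, 10)
row_std = std_rows_chunked(a, chunk_rows=64)
```

For the 100000 x 100 array in the example above, each full-size float64
temporary is about 80 MB, while a 1024-row block is under 1 MB.
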
-Travis