[Numpy-discussion] std(axis=1) memory footprint issues + moving avg / stddev

Travis Oliphant oliphant.travis at ieee.org
Sun Aug 27 01:49:55 CDT 2006


Torgil Svensson wrote:
> Hi
>
> ndarray.std(axis=1) seems to have memory issues on large 2D-arrays. I
> first thought I had a performance issue but discovered that std() used
> lots of memory and therefore caused lots of swapping.
>   
There are certainly lots of intermediate arrays created as the 
calculation proceeds.  The calculation is not particularly "smart."  It 
just does the basic averaging and multiplication needed.
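
As a rough illustration only (this is a Python sketch, not the actual C
code, and naive_std is a hypothetical name), the "basic averaging and
multiplication" approach looks something like the following. Each marked
step allocates a new array as large as the input, which is where the
memory goes. The keepdims argument assumes a reasonably recent NumPy.

```python
import numpy as np

def naive_std(a, axis=1):
    """Illustrative sketch: standard deviation built from full-size temporaries."""
    a = np.asarray(a, dtype=float)            # full-size float copy if input is not float
    mean = a.mean(axis=axis, keepdims=True)   # small: one value per row for axis=1
    dev = a - mean                            # full-size temporary
    sq = dev * dev                            # another full-size temporary
    return np.sqrt(sq.mean(axis=axis))        # small result, one value per row

x = np.arange(12).reshape(3, 4)
result = naive_std(x)                         # same values as x.std(axis=1)
```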

> I want to get an array where element i is the standard deviation of row
> i in the 2D array. Using valgrind on the std() function...
>
> $ valgrind --tool=massif python -c "from numpy import *;
> a=reshape(arange(100000*100),(100000,100)).std(axis=1)"
>
> ... showed me a peak of 200Mb memory while iterating line by line...
>
>   
The C code is basically a direct "translation" of the original Python 
code.  There are lots of temporaries created (apparently 5 at one point 
:-).  I did this before I had the _internal.py code in place, which is 
where I put Python functions that need to be accessed from C.  If I had 
to do it over again, I would place the std implementation there, where 
it could be appropriately optimized.
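
One possible user-side workaround, sketched under the assumption that
the per-row result is all that is needed: compute std(axis=1) over
blocks of rows, so that any temporaries are only as large as one block
rather than the whole array. The function name std_by_row_blocks and
the block parameter are made up for this example and are not part of
NumPy.

```python
import numpy as np

def std_by_row_blocks(a, block=1024):
    """Row-wise standard deviation of a 2D array, computed block by block.

    Hypothetical helper: temporaries created inside std() are limited to
    `block` rows at a time instead of the full array.
    """
    a = np.asarray(a)
    out = np.empty(a.shape[0], dtype=float)
    for start in range(0, a.shape[0], block):
        stop = start + block                  # slicing clips past the last row
        out[start:stop] = a[start:stop].std(axis=1)
    return out

a = np.arange(1000 * 100).reshape(1000, 100)
row_std = std_by_row_blocks(a, block=128)     # matches a.std(axis=1)
```

A smaller block lowers peak memory at the cost of more Python-level
loop iterations, so the best value depends on the array width and the
memory available.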



-Travis



