[SciPy-user] Arrays and strange memory usage ...

David Cournapeau cournape@gmail....
Tue Sep 2 18:19:10 CDT 2008


On Wed, Sep 3, 2008 at 2:11 AM, christophe grimault
<christophe.grimault@novagrid.com> wrote:
> Hi,
>
> I have an application that is very demanding in memory resources, so I
> started to look more closely at python + numpy/scipy as far as memory
> is concerned.

If you are really tight on memory, you will have problems with python
and with most programming languages that do not let you control memory
in a fine-grained manner. Now, it depends on what you mean by memory
demanding: if you have barely enough memory to hold your data, it will
be extremely difficult to do in python, and difficult in any language,
including C and other manually managed languages.

>
> I can't explain the following :
>
> I start my python and import scipy. A 'top' in the console shows:
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME COMMAND
> 14791 grimault  20   0 21624 8044 3200 S    0  0.4   0:00.43 python
>
> Now after typing :
>
> z = scipy.arange(1000000)
>
> I get :
> 14791 grimault  20   0 25532  11m 3204 S    0  0.6   0:00.44 python
>
> So the memory increased by ~ 7 Mb. I was expecting 4 Mb since the data
> type is int32, giving a 4*1000000 = 4 Mb memory chunk (in C/C++ at
> least).

a = scipy.arange(1e6)
a.itemsize * a.size

gives me 8e6 bytes. With a float argument, arange defaults to float64,
and I see a similar memory increase (~ 8 Mb). An integer argument gives
a platform integer instead (8 bytes per element on most 64-bit
systems), so the int32 assumption may not hold either.
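For reference, a quick way to see what arange actually gives you (a
minimal sketch; np is just the usual numpy alias):

import numpy as np

a = np.arange(1e6)       # float argument -> float64
print(a.dtype)           # float64
print(a.nbytes)          # 8000000 bytes: itemsize (8) * size (1e6)

b = np.arange(1000000)   # integer argument -> platform integer
print(b.dtype)           # int64 on most 64-bit platforms, int32 on 32-bit
print(b.nbytes)          # 8000000 or 4000000 accordingly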

>
> It gets even worse with complex floats. I tried:
> z = arange(1000000) + 1j*arange(1000000)
>
> Expecting 8 Mb,

Again, the numbers make sense: the sum defaults to complex128 (two
float64s, so 16 bytes per element), giving ~16 Mb rather than the 8 Mb
you expected. Which version of numpy/scipy are you using?
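A quick check of the actual dtype and byte count (again a sketch, with
the usual numpy alias):

import numpy as np

z = np.arange(1000000) + 1j*np.arange(1000000)
print(z.dtype)    # complex128
print(z.nbytes)   # 16000000 bytes, i.e. ~16 Mb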

I do not get unexpected results on my machine. Results may vary because
the python memory allocator tends to over-allocate to avoid
reallocating all the time, but IIRC numpy allocates array data with
malloc rather than the python allocator. More importantly, a single
'top' reading is not really representative of a typical numpy program,
and the numbers depend on what you are doing anyway.
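If you want numbers that are less noisy than 'top', one option is to
watch the process's peak resident size around the allocation. This is
only a sketch, and Linux-specific (ru_maxrss is reported in kilobytes
there, and it only tracks the peak):

import resource
import numpy as np

def peak_rss_kb():
    # peak resident set size of the current process, in kb on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss_kb()
z = np.arange(1000000, dtype=np.complex128)
after = peak_rss_kb()
print("z.nbytes = %d" % z.nbytes)                 # 16000000
print("peak RSS grew by ~ %d kb" % (after - before))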

> This is very annoying. Can someone explain this? Is there a way to
> create numpy arrays with the same (approximately! I know the array
> class adds some overhead...) memory footprint as in C/C++?

Arrays themselves have a footprint similar to C/C++ (for big arrays,
where the data dwarf the array structure overhead). But you will
quickly find that, depending on what you are doing (linear algebra, for
example), you will need copies. Note that the same problem exists in
C/C++; temporaries are very difficult to avoid there too (you need
things like expression templates and the like).
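To make the copy problem concrete, here is a small illustration (mine,
not a recommendation for all code) of how an ordinary expression
allocates temporaries, and how in-place operations avoid them:

import numpy as np

a = np.arange(1e6)

# Natural style: a*2 allocates an 8 Mb temporary, then + 1 allocates
# the result, so extra buffers are alive while the line executes.
b = a * 2 + 1

# In-place style: one preallocated output buffer, no temporaries.
c = np.empty_like(a)
np.multiply(a, 2, out=c)   # c <- a * 2
np.add(c, 1, out=c)        # c <- c + 1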

cheers,

David


