[Numpy-discussion] np.memmap and memory usage

Emmanuelle Gouillart emmanuelle.gouillart@normalesup....
Wed Jul 1 08:11:36 CDT 2009


	Hi Pauli,

	thank you for your answer! I was indeed measuring the memory used
with top, which is not the best tool for understanding what really
happens. I monitored "free" during the execution of my program and
indeed, the used number on the "-/+ buffers/cache" line stays roughly
constant (fluctuations are smaller than, say, 5 MB). When I close my
program, this number drops by roughly the size of one Z-slice of my
array (35 MB), so it is possible that the buffer is always full.
However, since I call memmap.flush at each iteration, shouldn't the
buffer be emptied each time?
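
	To be concrete, here is a minimal sketch of the kind of loop I
am running (the file name, shape and dtype are made up for
illustration; one Z-slice is 1 MB here):

import numpy as np

shape = (100, 512, 512)                                    # (Z, Y, X)
data = np.memmap('volume.raw', dtype=np.float32, mode='w+', shape=shape)

for z in range(shape[0]):
    data[z] = z       # stand-in for the real per-slice computation
    data.flush()      # writes dirty pages to disk, but they may well
                      # stay in the OS page cache afterwards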

	I observed other strange things as well: if I create a first
array with np.memmap, I get a MemoryError when I try to create a second
one (without doing anything with the first array other than creating
it). The sum of the sizes of the two files is 3.8 GB, much more than my
RAM size, but since I didn't load anything explicitly into memory, I
don't understand why it is not possible to have both arrays defined...
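
	In code, the failing case looks like this (file names and
shapes are made up, but the sizes match my files, about 1.9 GB each):

import numpy as np

a = np.memmap('stack1.raw', dtype=np.uint8, mode='w+',
              shape=(1900, 1000, 1000))
b = np.memmap('stack2.raw', dtype=np.uint8, mode='w+',  # the MemoryError
              shape=(1900, 1000, 1000))                 # is raised here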

	Anyway, I need to read a bit of documentation about data caching
in Linux, as you suggested.

	Thanks again,

	Emmanuelle

> > 	Is this an expected behaviour? How can I reduce the amount of
> > memory used by Ipython and still process my data?

> How do you measure the memory used? Note that on Linux, "top" includes 
> the size of OS caches for the memmap in the RSS size of a process.
> You can try to monitor "free" instead:

> $ free
>              total       used       free     shared    buffers     cached
> Mem:      12300488   11485664     814824          0     642928    7960736
> -/+ buffers/cache:    2882000    9418488
> Swap:      7847712       2428    7845284

> If the memory is used by OS caches, the "used" number on the "-/+ buffers/
> cache" line should stay constant while the program runs.

> In this case, what is most likely actually taking up memory is the OS 
> buffering the data in memory, before writing it to disk. Linux has at 
> least some system-wide parameters available that tune the aggressiveness 
> of data caching. I suppose there may also be some file-specific settings, 
> but I have no idea what they are.
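
As a starting point for that reading: the system-wide parameters in
question appear to be the vm.dirty_* sysctls. A minimal sketch to list
their current values (assuming the standard /proc layout on Linux):

import glob

# Write-back tuning knobs live under /proc/sys/vm/; dirty_ratio and
# dirty_background_ratio control when the kernel starts flushing.
for path in sorted(glob.glob('/proc/sys/vm/dirty_*')):
    with open(path) as f:
        print(path, '=', f.read().strip())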

