[Numpy-discussion] How to limit the numpy.memmap's RAM usage?
Sat Oct 23 11:27:53 CDT 2010
Charles R Harris :
> On Sat, Oct 23, 2010 at 10:15 AM, Charles R Harris
> <firstname.lastname@example.org> wrote:
> On Sat, Oct 23, 2010 at 9:44 AM, braingateway
> <firstname.lastname@example.org> wrote:
> David Cournapeau :
> 2010/10/23 braingateway <firstname.lastname@example.org
> Hi everyone,
> I noticed that numpy.memmap uses RAM to buffer data
> from memmap files.
> If I have a 100GB array in a memmap file and process it
> block by block,
> the RAM usage keeps increasing while the process
> runs, until
> there is no available space left in RAM (4GB), even though
> the block size is
> only 1MB.
> for example:
> a = numpy.memmap('a.bin', dtype='float64', mode='r')
> for i in range(0,len(a)/blocklen):
> Is there any way to restrict the memory usage in
> The whole point of using memmap is to let the OS do the
> buffering for
> you (which is likely to do a better job than you in many
> cases). Which
> OS are you using ? And how do you measure how much memory
> is taken by
> numpy for your array ?
> Hi David,
> I agree with you about the point of using memmap. That is why
> the behavior is so strange to me.
> I actually measure the size of resident set (pink trace in
> figure2) of the python process on Windows. Here I attached the
> result. You can see the RAM usage is definitely not file
> system cache.
> Umm, a good operating system will use *all* of ram for buffering
> because ram is fast and it assumes you are likely to reuse data
> you have already used once. If it needs some memory for something
> else it just writes a page to disk, if dirty, and reads in the new
> data from disk and changes the address of the page. Where you get
> into trouble is if pages can't be evicted for some reason. Most
> modern OS's also have special options available for reading in
> streaming data from disk that can lead to significantly faster
> access for that sort of thing, but I don't think you can do that
> with memmapped files.
> I'm not sure how Windows labels its memory. IIRC, memmapping a
> file leads to what is called file-backed memory; it is essentially
> virtual memory. Now, I won't bet my life that there isn't a
> problem, but I think a misunderstanding of the memory information
> is more likely.
> It is also possible that something else in your program is hanging
> onto memory but without knowing a lot more it is hard to tell. Are you
> seeing symptoms besides the memory graphs? It looks like you aren't
> running on windows, actually, so what OS are you running on?
Thanks a lot for the quick response. I am running the following super simple script:
a = numpy.memmap('a.bin', dtype='float64', mode='r')
for i in range(0,len(a)/blocklen):
Everything became super slow after Python ate all the RAM.
By the way, I also tried Qt's QFile::map(), and there is no problem at all...
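
One workaround that may be worth trying, sketched here under assumptions that are not confirmed anywhere in this thread: map only one block at a time instead of the whole file, copy that block into ordinary memory, and drop the mapping before moving on. The helper name iter_blocks and the default blocklen (131072 float64 values, i.e. 1MB) are hypothetical. Whether this actually keeps the resident set small depends on how the OS accounts for mapped pages, and it has not been checked against the setup described above.

```python
import os

import numpy as np


def iter_blocks(path, dtype="float64", blocklen=131072):
    """Yield consecutive blocks of a binary file as plain ndarrays.

    Hypothetical helper: each block is mapped on its own via the
    `offset` and `shape` arguments of numpy.memmap, copied, and the
    mapping is dropped before the copy is handed to the caller.
    """
    itemsize = np.dtype(dtype).itemsize
    n_items = os.path.getsize(path) // itemsize
    for start in range(0, n_items, blocklen):
        count = min(blocklen, n_items - start)
        block = np.memmap(path, dtype=dtype, mode="r",
                          offset=start * itemsize, shape=(count,))
        chunk = np.array(block)  # copy out of the mapping into normal memory
        del block                # drop the only reference to this mapping
        yield chunk
```

It would be used as "for block in iter_blocks('a.bin'): ...", with the per-block processing in the loop body. The copy costs one extra pass over each block, but it means no reference to a mapping outlives its iteration.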