[NumPy-Tickets] [NumPy] #2048: memory overflow when using numpy load in a loop

NumPy Trac numpy-tickets@scipy....
Sat Feb 11 15:36:54 CST 2012


#2048: memory overflow when using numpy load in a loop
---------------------------------------------------+------------------------
 Reporter:  eldada                                 |       Owner:  somebody   
     Type:  defect                                 |      Status:  new        
 Priority:  normal                                 |   Milestone:  Unscheduled
Component:  Other                                  |     Version:  1.5.1      
 Keywords:  load, memory, leak, overflow, NpzFile  |  
---------------------------------------------------+------------------------
 Loading npz files in a loop makes memory use grow until it overflows
 (how quickly depends on the length of the file list).
 None of the following seems to help:

  1. Deleting the variable that stores the data loaded from the file.
  2. Using mmap.
  3. Calling gc.collect() (garbage collection).
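
 For concreteness, the three attempts listed above might look roughly like the sketch below. The ticket does not show the exact calls that were tried, so the function name `load_with_attempted_remedies`, the use of `mmap_mode='r'` as the "mmap" attempt, and the way the three remedies are combined in one loop are all assumptions made for illustration. Python 3 syntax is used here.

```python
import gc
import os
import tempfile

import numpy as np


def load_with_attempted_remedies(path, n_iter=3):
    """Load `path` repeatedly, applying all three remedies on every pass.

    Hypothetical helper: it only illustrates what the attempts could look
    like, it is not the reporter's actual code.
    """
    last_shape = None
    for _ in range(n_iter):
        # Remedy 2 (assumed form): ask np.load for memory mapping.
        data = np.load(path, mmap_mode='r')
        x = data['X']
        last_shape = x.shape
        data.close()
        # Remedy 1: delete the variables that hold the loaded data.
        del x, data
        # Remedy 3: force a garbage collection pass.
        gc.collect()
    return last_shape


# Small demo file so the sketch runs quickly.
demo_path = os.path.join(tempfile.mkdtemp(), 'tmp.npz')
np.savez(demo_path, X=np.random.randn(10, 10))
shape = load_with_attempted_remedies(demo_path)
```

 The helper returns the shape of the last array it loaded, which is only there to confirm that each pass really read the file.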


 The following code should reproduce the phenomenon:

 import numpy as np

 # generate a file for the demo
 X = np.random.randn(1000, 1000)
 np.savez('tmp.npz', X=X)

 # here comes the overflow:
 for i in xrange(1000000):
     data = np.load('tmp.npz')
     data.close()  # avoid the "too many files are open" error

 In my real application the loop runs over a list of files, and memory use
 exceeds 24 GB of RAM.

 Please note that this was tried on Ubuntu 11.10, with both NumPy 1.5.1
 and 1.6.0.
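
 One way to put numbers on the growth is to sample the process's peak resident set size while the loop runs. The sketch below is only an illustration and not part of the original report: the helper name `peak_rss_while_loading`, the iteration counts, and the smaller demo array are assumptions. It uses the standard-library `resource` module, so it runs on Unix-like systems only, and it is written in Python 3 syntax.

```python
import os
import resource
import tempfile

import numpy as np


def peak_rss_while_loading(path, n_iter=20, every=5):
    """Repeat the load/close loop and sample peak RSS every `every` passes.

    Hypothetical helper. ru_maxrss is the peak resident set size of the
    process so far; its unit is platform dependent (kilobytes on Linux).
    """
    samples = []
    for i in range(n_iter):
        data = np.load(path)
        data.close()
        if (i + 1) % every == 0:
            usage = resource.getrusage(resource.RUSAGE_SELF)
            samples.append(usage.ru_maxrss)
    return samples


# Small demo file so the sketch runs quickly.
demo_path = os.path.join(tempfile.mkdtemp(), 'tmp.npz')
np.savez(demo_path, X=np.random.randn(100, 100))
samples = peak_rss_while_loading(demo_path)
print(samples)
```

 Because `ru_maxrss` is a peak value, the samples can never decrease. A leak would show up as samples that keep climbing as `n_iter` grows, while a healthy loop should level off after the first few passes.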

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/2048>
NumPy <http://projects.scipy.org/numpy>