[NumPy-Tickets] [NumPy] #2048: memory overflow when using numpy load in a loop
NumPy Trac
numpy-tickets@scipy....
Sat Feb 11 15:36:54 CST 2012
#2048: memory overflow when using numpy load in a loop
---------------------------------------------------+------------------------
Reporter: eldada | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: Other | Version: 1.5.1
Keywords: load, memory, leak, overflow, NpzFile |
---------------------------------------------------+------------------------
Loading npz files in a loop makes memory usage grow until it overflows
(how far depends on the length of the file list).
None of the following seems to help:[[BR]]
1. Deleting the variable that stores the data loaded from the file.[[BR]]
2. Using mmap.[[BR]]
3. Calling gc.collect() (garbage collection).[[BR]]
[[BR]]
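For concreteness, the three attempts listed above might look roughly like the sketch below. It assumes that "using mmap" refers to the `mmap_mode` argument of `np.load`. The function name, the temporary demo file, and the small iteration count are hypothetical and only for illustration; this is not the reporter's actual code.

```python
import gc
import os
import tempfile

import numpy as np


def load_with_attempted_fixes(path, iterations=3):
    """Load `path` repeatedly while applying the three attempts listed above."""
    shapes = []
    for _ in range(iterations):
        # Attempt 2: ask for memory mapping (assumed to mean np.load's mmap_mode).
        data = np.load(path, mmap_mode='r')
        shapes.append(data['X'].shape)
        data.close()  # release the file handle, as in the ticket
        # Attempt 1: delete the variable that holds the loaded NpzFile.
        del data
        # Attempt 3: force a garbage-collection pass.
        gc.collect()
    return shapes


# Hypothetical demo file, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'tmp.npz')
np.savez(demo_path, X=np.random.randn(100, 100))
shapes = load_with_attempted_fixes(demo_path)
```
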
The following code should reproduce the phenomenon:[[BR]]
import numpy as np[[BR]]
# generate a file for the demo[[BR]]
X = np.random.randn(1000, 1000)[[BR]]
np.savez('tmp.npz', X=X)[[BR]]
[[BR]]
# here comes the overflow:[[BR]]
for i in xrange(1000000):[[BR]]
    data = np.load('tmp.npz')[[BR]]
    data.close()  # avoid the "too many files are open" error[[BR]]
[[BR]]
# In my real application the loop runs over a list of files, and memory usage
exceeds 24 GB of RAM! [[BR]]
# Please note that this was tried on Ubuntu 11.10, with both NumPy 1.5.1 and
1.6.0.
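
One way to check whether memory really grows with each load is to record the process's peak resident set size after every iteration. The sketch below assumes a Unix system (the `resource` module is not available on Windows); the function name, the temporary demo file, and the small iteration count are hypothetical. `ru_maxrss` is reported in kilobytes on Linux, and because it is a peak value it can only stay flat or rise.

```python
import os
import resource  # Unix-only standard-library module
import tempfile

import numpy as np


def peak_rss_per_iteration(path, iterations=5):
    """Load and close `path` repeatedly, recording peak RSS after each pass."""
    peaks = []
    for _ in range(iterations):
        data = np.load(path)
        data.close()  # same pattern as the loop in the ticket
        usage = resource.getrusage(resource.RUSAGE_SELF)
        peaks.append(usage.ru_maxrss)  # kilobytes on Linux
    return peaks


# Hypothetical demo file, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'tmp.npz')
np.savez(demo_path, X=np.random.randn(100, 100))
peaks = peak_rss_per_iteration(demo_path)
```

A series that keeps climbing over many iterations would be consistent with the leak described here, while a series that levels off would suggest memory is being reused.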
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/2048>
NumPy <http://projects.scipy.org/numpy>