[Numpy-discussion] Memory leak found in ndarray (I think)?

Wes McKinney wesmckinn@gmail....
Mon Jul 12 13:22:17 CDT 2010


This one was quite a bear to track down, starting from the very
high-level observation of "why is my application leaking memory?".
I've reproduced it on Windows XP using NumPy 1.3.0 on Python 2.5 and
NumPy 1.4.1 on Python 2.6 (EPD). It seems that calling .astype(bool)
on an ndarray slice with object dtype leaves a stray reference count
behind on the elements, which should be pretty easy to see:

from datetime import datetime
import numpy as np
import sys

def foo(verbose=True):
    arr = np.array([datetime.today() for _ in xrange(1000)])
    arr = arr.reshape((500, 2))
    sl = arr[:, 0]

    if verbose: print 'Ref ct of index 0: %d' % sys.getrefcount(sl[0])

    for _ in xrange(10):
        result = sl.astype(bool)  # the conversion under test

    if verbose: print 'Ref ct of index 0: %d' % sys.getrefcount(sl[0])

if __name__ == '__main__':
    foo()
    for i in xrange(10000):
        if not i % 1000: print i
        foo(verbose=False)

On my machine this leaks about 100 MB of memory that is never
released. Let me know if I've misinterpreted the results. I'll happily
create a ticket on the Trac page.
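
For anyone who wants to repeat the check on a newer interpreter, here
is a minimal Python 3 sketch of the same before/after reference count
comparison. The helper names refcount_delta and check_astype_bool are
made up for illustration and are not part of NumPy, and passing
dtype=object explicitly is an assumption to make the object dtype
unambiguous. No expected value is shown for the NumPy call because the
result depends on the NumPy version installed.

```python
import sys
from datetime import datetime


def refcount_delta(obj, action, repeats=10):
    """Return the change in obj's reference count after calling
    action() `repeats` times. Hypothetical helper, not a NumPy API."""
    before = sys.getrefcount(obj)
    for _ in range(repeats):
        action()  # any return value is discarded immediately
    return sys.getrefcount(obj) - before


def check_astype_bool(repeats=10):
    """Measure the reference count change of one element of an
    object-dtype slice across repeated .astype(bool) calls."""
    import numpy as np  # imported here so refcount_delta works without NumPy

    # dtype=object is stated explicitly (an assumption of this sketch).
    arr = np.array([datetime.today() for _ in range(1000)], dtype=object)
    sl = arr.reshape((500, 2))[:, 0]
    first = sl[0]
    return refcount_delta(first, lambda: sl.astype(bool), repeats)


if __name__ == '__main__':
    # A nonzero result would point to references being left behind.
    print('Ref ct change of index 0: %d' % check_astype_bool())
```

A result of zero means the conversion released every reference it took
on that element. A value that grows with repeats would match the
behaviour described above.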

Thanks,
Wes

