[SciPy-dev] Reliable way to know memory consumption of functions/scripts/etc..

David Cournapeau david@ar.media.kyoto-u.ac...
Wed Jun 13 04:25:12 CDT 2007


Francesc Altet wrote:
> On Tue, 12 Jun 2007 at 15:27 +0900, David Cournapeau wrote:
>> Hi,
>>
>> I was wondering whether there was a simple and reliable way to know how
>> much memory a Python script takes between two code boundaries. I don't
>> need anything very precise, more a sense of how a given piece of code
>> scales with its input: does it take the same amount of memory, several
>> times as much, etc. Is this possible in Python?
>
> I don't think this is going to be possible in plain Python (at least in a
> non-debugging build).  What I normally do is watch the process in real
> time with the 'top' command and infer the increase in memory usage by
> running a few experiments sequentially.  There should be better tools
> around, though.
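
Watching top by hand gets tedious quickly, so something along these lines
could automate the same kind of spying from outside the process (just a
rough, untested sketch built on 'ps', not a real tool; the pid to watch is
passed on the command line):

    import os
    import sys
    import time

    def rss_kb(pid):
        # Ask ps for the resident set size (in KB) of the given process;
        # this is roughly what top shows in its RES column.
        out = os.popen("ps -o rss= -p %d" % pid).read().strip()
        return int(out)

    if __name__ == "__main__":
        pid = int(sys.argv[1])
        while True:
            print("RSS: %d KB" % rss_kb(pid))
            time.sleep(1)
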
In the meantime, I found the COUNT_ALLOCS build option, which looks exactly 
like what I want, but unfortunately it crashes when importing numpy, and 
this seems to be non-trivial to fix (I stopped digging after half an hour).

"""
---------------------------------------------------------------------------
COUNT_ALLOCS                                            introduced in 0.9.9
                                             partly broken in 2.2 and 2.2.1

Each type object grows three new members:

    /* Number of times an object of this type was allocated. */
    int tp_allocs;

    /* Number of times an object of this type was deallocated. */
    int tp_frees;

    /* Highwater mark:  the maximum value of tp_allocs - tp_frees so
     * far; or, IOW, the largest number of objects of this type alive at
     * the same time.
     */
    int tp_maxalloc;

Allocation and deallocation code keeps these counts up to date.
Py_Finalize() displays a summary of the info returned by sys.getcounts()
(see below), along with assorted other special allocation counts (like
the number of tuple allocations satisfied by a tuple free-list, the number
of 1-character strings allocated, etc).

Before Python 2.2, type objects were immortal, and the COUNT_ALLOCS
implementation relies on that.  As of Python 2.2, heap-allocated type/
class objects can go away.  COUNT_ALLOCS can blow up in 2.2 and 2.2.1
because of this; this was fixed in 2.2.2.  Use of COUNT_ALLOCS makes
all heap-allocated type objects immortal, except for those for which no
object of that type is ever allocated.

Starting with Python 2.3, if Py_TRACE_REFS is also defined, COUNT_ALLOCS
arranges to ensure that the type object for each allocated object
appears in the doubly-linked list of all objects maintained by
Py_TRACE_REFS.

Special gimmicks:

sys.getcounts()
    Return a list of 4-tuples, one entry for each type object for which
    at least one object of that type was allocated.  Each tuple is of
    the form:

        (tp_name, tp_allocs, tp_frees, tp_maxalloc)

    Each distinct type object gets a distinct entry in this list, even
    if two or more type objects have the same tp_name (in which case
    there's no way to distinguish them by looking at this list).  The
    list is ordered by time of first object allocation:  the type object
    for which the first allocation of an object of that type occurred
    most recently is at the front of the list.
--------------------------------------------------------------------------
"""

If someone more knowledgeable than me is willing to help, I think it would 
be a great addition for numpy.

David
>
> Incidentally, I have some code that gives you the amount of memory
> currently being used by the process at some point in the code, but this
> is different from knowing the amount of memory taken between two points.
> If you are interested in this, tell me (it only works on Linux, but it
> should be feasible to port it to Windows).
Well, I don't use windows, so I could use your code :)
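
A helper along those lines probably looks something like the sketch below
(only a guess at its shape, Linux-only, reading VmRSS out of
/proc/self/status):

    def rss_self_kb():
        # Linux-only: parse the VmRSS (resident set size) line out of
        # /proc/self/status; the kernel reports the value in kB.
        for line in open("/proc/self/status"):
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
        return 0

    before = rss_self_kb()
    data = [0.0] * 500000          # whatever sits between the two boundaries
    after = rss_self_kb()
    print("resident size grew by roughly %d KB" % (after - before))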

David

