# [SciPy-User] memory error - numpy mean - netcdf4

Phil Morefield philmorefield@yahoo....
Fri Aug 26 14:33:53 CDT 2011

"If the values are integers then the running total may overflow."

That's a good point. Though you could just do this:

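###################################
import numpy as np

# Start from the first 2D slice, as float so nothing overflows or truncates
array = np.array(netcdf_variable[0], dtype=np.float64)

# Incremental mean: after step i, array holds the mean of slices 0..i
for i in range(1, len(netcdf_variable)):
    array += (netcdf_variable[i] - array) / (i + 1.0)
###################################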

The formula you have written looks like you're collapsing everything into a single value. I think he's trying to average a bunch of 2D arrays into a single 2D array.
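
To make that distinction concrete, here is a small sketch on a made-up in-memory array (not the netCDF data, which is too big to treat this way): averaging over everything gives one number, while averaging over the first (time) axis gives one 2D array.

```python
import numpy as np

# Hypothetical stand-in data: 4 time steps of a 2x3 grid.
stack = np.arange(24, dtype=np.float64).reshape(4, 2, 3)

overall = stack.mean()          # a single scalar over every element
per_cell = stack.mean(axis=0)   # a 2x3 array, averaged over the time axis only
```

The second form is the result being asked for; the loop-based approaches in this thread get to it without loading every time step at once.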

From: srean <srean.list@gmail.com>
To: Phil Morefield <philmorefield@yahoo.com>; SciPy Users List <scipy-user@scipy.org>
Sent: Friday, August 26, 2011 2:00 PM
Subject: Re: [SciPy-User] memory error - numpy mean - netcdf4

> Finally, you're getting the MemoryError because you're trying to put a ginormous array into memory all at once. Your OS can't handle it. Just loop through each time step and keep a running total and a counter. Then divide your total (which is an array) by your counter (which is an integer or float) and presto: you have your average. It's plenty fast, so don't worry.
>
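
A minimal sketch of the running-total-and-counter idea quoted above. The function name `running_mean` and the `fake_variable` array are made up for illustration; the assumption is that the real variable yields one 2D slice per time step when iterated or indexed.

```python
import numpy as np

def running_mean(slices):
    """Average an iterable of equally shaped 2D arrays, one slice at a time."""
    total = None
    count = 0
    for s in slices:
        # Accumulate in float64 so integer data cannot overflow the total.
        s = np.asarray(s, dtype=np.float64)
        if total is None:
            total = s.copy()
        else:
            total += s
        count += 1
    if count == 0:
        raise ValueError("no slices to average")
    return total / count

# Hypothetical stand-in for the netCDF variable: 5 time steps of a 2x3 grid.
fake_variable = np.arange(30, dtype=np.int16).reshape(5, 2, 3)
mean_2d = running_mean(fake_variable)
```

Only one slice plus the accumulator is held in memory at any time, which is the point of the suggestion.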

In fact one can even avoid keeping the running total. If the values are integers then the running total may overflow.

Say you have the mean \mu computed from N points, and a new collection of m points whose mean is t.
Then the mean of the N + m points is:   \mu_{new} = \mu + \frac{m}{N + m} (t - \mu)
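
The update above could be sketched in NumPy roughly as follows. The helper `update_mean`, the batch size of 3, and the random `fake_variable` array are assumptions made for illustration; `mu` is the current 2D mean, `n` the number of slices already folded in, and each batch is a stack of `m` new slices along axis 0.

```python
import numpy as np

def update_mean(mu, n, batch):
    """Fold a batch of m slices (stacked on axis 0) into the mean of n earlier slices."""
    batch = np.asarray(batch, dtype=np.float64)
    m = batch.shape[0]
    t = batch.mean(axis=0)                      # mean of the m new slices
    mu_new = mu + (m / float(n + m)) * (t - mu)
    return mu_new, n + m

# Hypothetical stand-in data: 10 time steps of a 4x5 grid, read 3 steps at a time.
fake_variable = np.random.default_rng(0).random((10, 4, 5))
mu = np.zeros(fake_variable.shape[1:])
n = 0
for start in range(0, len(fake_variable), 3):
    mu, n = update_mean(mu, n, fake_variable[start:start + 3])
```

No running total is kept, so the intermediate values stay on the scale of the data itself. With `n = 0` the first update simply sets the mean to `t`.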