[Numpy-discussion] Possible improvement to numpy.mean()

Michael Gilbert michael.s.gilbert@gmail....
Tue Apr 6 16:07:30 CDT 2010


Hi,

I am applying Monte Carlo methods to a problem involving mixed
deterministic and random values.  In order to avoid a lot of special
handling and corner cases, I represent the deterministic quantities as
numpy arrays filled with a single value.

Anyway, I found that the standard deviation of these constant arrays
turns out to be non-zero when they take on large values, which is
wrong.  This is due to machine precision loss.
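
For example, something along these lines shows the effect (the exact
numbers depend on array size and platform):

    import numpy as np

    # A deterministic quantity represented as an array of one repeated
    # large value.
    x = np.ones(1000000) * 1.000000001e9

    print(np.mean(x))  # may differ slightly from 1.000000001e9
    print(np.std(x))   # may be non-zero although all elements are equal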

It turns out to be fairly straightforward to check for this situation
up front; see the attached code.  I've also included a more accurate
algorithm for computing the mean, but it adds an extra multiplication
for every term in the sum, which is obviously undesirable from a
performance perspective.  Would it make sense to automatically detect
the precision loss and fall back to the more accurate approach only
when it is needed?
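
For illustration only (this is not necessarily the exact code in the
attachment, and the upfront check here is just a hypothetical
placeholder), the idea is roughly to scale each element by 1/n before
summing:

    import numpy as np

    def mean_scaled(x):
        # Multiply each element by 1/n before summing, so the running
        # sum stays near the magnitude of the mean instead of growing
        # to n times it.  This is the extra multiplication per term.
        return np.sum(x * (1.0 / x.size))

    def mean_checked(x, tol=1e-12):
        # Placeholder check: if the float spacing at the magnitude of
        # the plain sum is no longer small compared with the largest
        # element, low-order bits may have been dropped, so recompute
        # with the scaled sum instead.
        s = np.sum(x)
        if np.spacing(abs(s)) > tol * np.max(np.abs(x)):
            return mean_scaled(x)
        return s / x.size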

If that seems ok, I can take a look at the numpy code and submit a
patch.

Best wishes,
Mike
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mean-problem
Type: application/octet-stream
Size: 1223 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/numpy-discussion/attachments/20100406/b0f4c2bb/attachment.obj 

