[Numpy-discussion] Getting the indexes of the myarray.min()
southey at uiuc.edu
Thu May 13 14:22:02 CDT 2004
Raymond D. Hettinger is writing a general statistics module, statistics.py
('A collection of functions for summarizing data'), that is somewhere in
Python CVS (I cannot find the exact reference, but it appeared in a fairly
recent Python thread). He uses a one-pass algorithm from Knuth for the
variance that has good numerical stability.
Below is a rather rough version, adapted from my own use case (masked arrays),
that uses Knuth's algorithm for the variance. It lacks features such as
dimension checking (it assumes the variance can be computed) and documentation.
#print nrows, ncols
# Create matrices to hold statistics
N_obs = numarray.zeros(ncols, type='Float64')
Sum = numarray.zeros(ncols, type='Float64')
Var = numarray.zeros(ncols, type='Float64')
Min = numarray.zeros(ncols, type='Float64')
Max = numarray.zeros(ncols, type='Float64')
Mean = numarray.zeros(ncols, type='Float64')
AdjM = numarray.zeros(ncols, type='Float64')
NewM = numarray.zeros(ncols, type='Float64')
DifM = numarray.zeros(ncols, type='Float64')
for row in range(nrows):
    for col in range(ncols):
        # [...]
        N_obs[col] = N_obs[col] + 1
        Sum[col] = Sum[col] + t_value
        if t_value > Max[col]:
            Max[col] = t_value
        if t_value < Min[col]:
            Min[col] = t_value
        # [...]
        Var[col] = Var[col] + # [...]
        Mean[col] = NewM[col]
print 'N_obs\n', N_obs
print 'Sum\n', Sum
print 'Mean\n', Mean
print 'Var\n', Var/(nrows-1)
if __name__ == '__main__':
    # [...]
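
A minimal sketch of the same one-pass, per-column idea, written with plain Python lists so that it runs without numarray. The function name `column_stats`, the dictionary keys, and the choice to start the minimum and maximum at plus and minus infinity are assumptions made for illustration, not part of the code above. The variance update is the standard single-pass recurrence usually attributed to Knuth and Welford, and the final division by n - 1 matches the `Var/(nrows-1)` printed above.

```python
import math


def column_stats(rows):
    """Compute per-column count, sum, min, max, mean and sample variance
    in a single pass over a list of equal-length rows."""
    ncols = len(rows[0])
    n = [0] * ncols
    total = [0.0] * ncols
    mean = [0.0] * ncols
    m2 = [0.0] * ncols               # running sum of squared deviations
    lo = [math.inf] * ncols          # assumption: start min at +infinity
    hi = [-math.inf] * ncols         # assumption: start max at -infinity

    for row in rows:
        for col, x in enumerate(row):
            n[col] += 1
            total[col] += x
            if x < lo[col]:
                lo[col] = x
            if x > hi[col]:
                hi[col] = x
            # One-pass update: use the deviation from the old mean and
            # the deviation from the new mean.
            delta = x - mean[col]
            mean[col] += delta / n[col]
            m2[col] += delta * (x - mean[col])

    # Sample variance (divide by n - 1); undefined for a single row.
    var = [m2[c] / (n[c] - 1) if n[c] > 1 else float("nan")
           for c in range(ncols)]
    return {"n": n, "sum": total, "min": lo, "max": hi,
            "mean": mean, "var": var}


if __name__ == "__main__":
    stats = column_stats([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
    for name in ("n", "sum", "min", "max", "mean", "var"):
        print(name, stats[name])
```

Starting the extremes at infinity rather than zero means that a column of all-positive or all-negative values still gets a correct minimum and maximum.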
---- Original message ----
>Date: Thu, 13 May 2004 15:42:30 -0400
>From: "Perry Greenfield" <perry at stsci.edu>
>Subject: RE: [Numpy-discussion] Getting the indexes of the myarray.min()
>To: "Russell E Owen" <rowen at u.washington.edu>, "numarray" <numpy-discussion at lists.sourceforge.net>
>> Russell E Owen wrote:
>> At 9:27 AM -0400 2004-05-13, Perry Greenfield wrote:
>> >... One has to trade off the number of such functions
>> >against the speed savings. Another example is getting max and min values
>> >for an array. I've long thought that this is so often done they could
>> >be done in one pass. There isn't a function that does this yet though.
>> Statistics is another area where multiple return values could be of
>> interest -- one may want the mean and std dev, and making two passes
>> is wasteful (since some of the same info needs to be computed both
>> A do-all function that computes min, min location, max, max location,
>> mean and std dev all at once would be nice (especially if the
>> returned values were accessed by name, rather than just being a tuple
>> of values, so they could be referenced safely and readably).
>> -- Russell
>We will definitely add something like this for 1.0 or 1.1.
>(but probably for min and max location, it will just be
>for the first encountered).
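
The do-all function suggested in the quoted message could look roughly like the following sketch for a flat sequence. It is not the numarray API: the names `ArrayStats` and `describe` are hypothetical, and the standard deviation is assumed to be the sample form (dividing by n - 1). Results are returned as a named tuple so they can be read by name, and the strict comparisons report the first location at which the minimum or maximum is encountered.

```python
import math
from collections import namedtuple

# Hypothetical result type: fields are accessed by name, not position.
ArrayStats = namedtuple("ArrayStats", "min argmin max argmax mean std")


def describe(values):
    """Return min, min location, max, max location, mean and sample
    standard deviation of a non-empty sequence in one pass."""
    if len(values) == 0:
        raise ValueError("describe() needs at least one value")

    lo = hi = values[0]
    lo_i = hi_i = 0
    mean = 0.0
    m2 = 0.0                          # running sum of squared deviations

    for i, x in enumerate(values):
        # Strict comparisons keep the first index where an extreme occurs.
        if x < lo:
            lo, lo_i = x, i
        if x > hi:
            hi, hi_i = x, i
        delta = x - mean
        mean += delta / (i + 1)
        m2 += delta * (x - mean)

    n = len(values)
    std = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
    return ArrayStats(lo, lo_i, hi, hi_i, mean, std)


if __name__ == "__main__":
    s = describe([1.0, 5.0, 1.0, 5.0])
    print(s.min, s.argmin, s.max, s.argmax, s.mean, s.std)
```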