[Numpy-discussion] Re: help in improving data analysis code

gf gyromagnetic at gmail.com
Fri Nov 25 14:19:00 CST 2005


From: Francesc Altet <faltet at ca...>
Re: help in improving data analysis code
2005-11-25 07:32

>> On Friday 25 November 2005 16:27, Francesc Altet wrote:
>>>      print nn[argsort(abs(nn_c-nn_c.mean()),0)][:-int(sz*0.10),0]
>>
>> Oops, I got confused. This should work better ;-)
>>
>>      print nn[argsort(abs(nn-nn.mean()),0)][:-int(sz*0.10),0]

Hi Francesc,
Thank you for the suggestions.
Your code performs a different task than mine did. In particular,
I believe it does not recompute the mean ('re-mean' the data) after
removing each point. However, building on the ideas in your code, I now
have the function below, which appears to be more efficient (although I
haven't timed it).

Any additional suggestions are appreciated.

-g

====

from numarray import argsort, floor, absolute

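def eliminate_outliers(data,frac):
    # Number of points to drop: floor of (size * frac).
    num_to_eliminate = int(floor(data.size()*frac))
    for i in range(num_to_eliminate):
        # Sort by distance from the current mean and drop the farthest
        # point; the mean is recomputed on the next pass.
        data = data[argsort(absolute(data-data.mean()),0)][:-1,0]
    return data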

if __name__ == "__main__":
    from numarray.mlab import rand
    sz = 100
    nn = rand(sz,1)
    nn[:10] = 20*rand(10,1)
    nn[sz-10:] = -20*rand(10,1)
    print eliminate_outliers(nn,0.10)
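
For comparison, here is a minimal sketch of both approaches in present-day NumPy rather than numarray. It is only an illustration under a few assumptions: the names trim_once and trim_iteratively are made up for this example, the input is flattened to 1-D, and the iterative version uses argmax plus np.delete instead of a full sort on every pass, which is a substitution on my part and not the code posted above. Unlike the argsort version, it keeps the surviving points in their original order, and ties may be broken differently.

```python
import numpy as np


def trim_once(data, frac):
    """Drop the fraction of points farthest from the mean, computed once."""
    data = np.asarray(data, dtype=float).ravel()
    n_drop = int(np.floor(data.size * frac))
    # Indices ordered from closest to farthest from the (single) mean.
    order = np.argsort(np.abs(data - data.mean()))
    return data[order[:data.size - n_drop]]


def trim_iteratively(data, frac):
    """Drop one point at a time, recomputing the mean after each removal."""
    data = np.asarray(data, dtype=float).ravel()
    n_drop = int(np.floor(data.size * frac))
    for _ in range(n_drop):
        # Only the single farthest point is needed, so argmax is enough.
        worst = np.argmax(np.abs(data - data.mean()))
        data = np.delete(data, worst)
    return data


if __name__ == "__main__":
    sample = np.array([-10, 0, 0, 0, 0, 0, 0, 0, 12, 100])
    print(trim_once(sample, 0.2))
    print(trim_iteratively(sample, 0.2))
```

On the small sample array, the one-pass version discards 100 and -10, while the iterative version discards 100 and then 12, because removing 100 pulls the mean down close to zero and makes 12 the farthest remaining point. That is the 're-mean' difference described above.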



