[SciPy-user] Statistics advise with scipy
Wed Jul 23 10:01:30 CDT 2008
I've had some success with the following:
1. Define a simple statistical model for your data. That is, from the
previous data, define a distribution for the probability of the next
datum.
2. Define a cutoff probability separating valid data from outliers.
3. For each datum, compute its probability based on previous data, and
tag it as valid or outlier.
The advantage is that you can start with a simple statistical model
(for example, a Gaussian centered on the last valid entry) and
customize it as you find cases that are not well handled.
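
A minimal sketch of the three steps above, using the Gaussian example. Several things here are assumptions for illustration and not part of the recipe itself: the function name tag_outliers, the fixed standard deviation sigma, the cutoff value of 0.01, treating the first datum as valid, and using the two-sided tail probability as the "probability" of a datum. It uses only the standard library; 2 * scipy.stats.norm.sf(z) gives the same tail probability.

```python
import math


def tag_outliers(data, sigma, cutoff=0.01):
    """Return a list of booleans: True for valid data, False for outliers.

    Assumed model: the next datum is Gaussian, centered on the last
    valid entry, with a fixed standard deviation `sigma`.
    """
    if not data:
        return []
    valid = [True]            # assumption: the first datum is valid
    last_valid = data[0]
    for x in data[1:]:
        # Step 1: model for the next datum, centered on the last valid entry.
        z = abs(x - last_valid) / sigma
        # Two-sided tail probability of a deviation at least this large.
        p = math.erfc(z / math.sqrt(2.0))
        # Steps 2 and 3: compare against the cutoff and tag the datum.
        is_valid = p >= cutoff
        valid.append(is_valid)
        if is_valid:
            last_valid = x    # only valid data update the model
    return valid


series = [10.0, 10.2, 9.9, 25.0, 10.1, 10.3]
mask = tag_outliers(series, sigma=0.5, cutoff=0.01)
cleaned = [x for x, ok in zip(series, mask) if ok]
```

Because outliers never update last_valid, a single spike does not drag the model along with it. A real series would probably want sigma estimated from recent valid data instead of being fixed.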
2008/7/22 didier rano <firstname.lastname@example.org>:
> I haven't found a solution to my problem yet, but I am reading a good
> article about removing
> outliers: http://www.lcgceurope.com/lcgceurope/data/articlestandard//lcgceurope/502001/4509/article.pdf
> Now I need to experiment with the methods described in this article.
> Didier Rano
> 2008/7/22 Tim Michelsen <email@example.com>:
>> >> My data is not normal. Do you know of any robust methods in scipy? Or
>> >> maybe in another Python module?
>> > Mmh, I'm sure you could implement some yourself. That way, we could
>> > start another scikit. There are already some winsorization and
>> > trimming functions in scipy.stats.
>> > Alternatively, you can try to use R and numpy through rpy:
>> > http://rpy.sourceforge.net/
>> May I ask you to give some feedback on which method worked for you?
>> I am also working on the problem of removing outliers etc. from data.
>> Thanks in advance,
>> SciPy-user mailing list
> Didier Rano