[SciPy-dev] cleaning out wiki spam

Peter Wang pwang@enthought....
Wed Feb 25 06:20:46 CST 2009

On Feb 25, 2009, at 1:20 AM, Fernando Perez wrote:

> Inspired by this, I just went and nuked ~1900 out of the ipython one,
> leaving only the 128 that are probably for real.  I hope this helps
> also reduce the load a bit more.

Great, thank you!  One thing that occurs to me is that once you have a  
fairly high ratio of ham to spam, it might be worth saving the  
directory listing into a base "goodpages.txt" that can then be used as  
a whitelist filter in the future when blowing away spam via regexes.   
(Hopefully we won't have to do that on this scale again, but if  
history is any indicator, spammers always find a way...)
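
For what it's worth, here is a minimal sketch of that whitelist idea in Python. It assumes each wiki page shows up as one entry in a pages directory, and the function names, the default "goodpages.txt" location, and the spam regex are all hypothetical placeholders rather than anything the wiki software provides. It only reports candidates; it deletes nothing.

```python
import os
import re


def save_whitelist(pages_dir, whitelist_path="goodpages.txt"):
    """Snapshot the current (mostly clean) page listing, one name per line."""
    names = sorted(os.listdir(pages_dir))
    with open(whitelist_path, "w", encoding="utf-8") as f:
        for name in names:
            f.write(name + "\n")
    return names


def load_whitelist(whitelist_path="goodpages.txt"):
    """Read the saved listing back as a set of known-good page names."""
    with open(whitelist_path, encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f if line.strip()}


def select_spam(page_names, whitelist, spam_pattern):
    """Return names that match the spam regex and are NOT whitelisted.

    Whitelisted pages are skipped even when the regex matches them, so an
    overly broad pattern cannot take out a known-good page.
    """
    rx = re.compile(spam_pattern, re.IGNORECASE)
    return [n for n in page_names if n not in whitelist and rx.search(n)]
```

The snapshot would be taken once, right after a cleanup, with save_whitelist(). On a later pass, the result of select_spam(os.listdir(pages_dir), load_whitelist(), pattern) is the list to review before removing anything.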


