[SciPy-dev] cleaning out wiki spam
Wed Feb 25 06:20:46 CST 2009
On Feb 25, 2009, at 1:20 AM, Fernando Perez wrote:
> Inspired by this, I just went and nuked ~1900 out of the ipython one,
> leaving only the 128 that are probably for real. I hope this helps
> also reduce the load a bit more.
Great, thank you! One thing that occurs to me is that once you have a
fairly high ratio of ham to spam, it might be worth saving the
directory listing into a base "goodpages.txt" that can then be used as
a whitelist filter in the future when blowing away spam via regexes.
(Hopefully we won't have to do that on this scale again, but if
history is any indicator, spammers always find a way...)
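
To make the whitelist idea concrete, here is a rough Python sketch, not a tested cleanup tool. It assumes each wiki page is one entry in a pages directory and that spam is recognizable from the page name; the function names and the layout are hypothetical, and only the "goodpages.txt" name comes from the suggestion above. It only reports candidates and leaves the actual deletion to a human.

```python
import re
from pathlib import Path


def save_whitelist(pages_dir, whitelist_path):
    """Write the current page names, one per line, to the whitelist file."""
    names = sorted(p.name for p in Path(pages_dir).iterdir())
    Path(whitelist_path).write_text("".join(name + "\n" for name in names))
    return names


def load_whitelist(whitelist_path):
    """Return the set of page names recorded as known-good."""
    lines = Path(whitelist_path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def find_spam(pages_dir, whitelist, spam_patterns):
    """List page names that match a spam regex and are not whitelisted."""
    compiled = [re.compile(p) for p in spam_patterns]
    spam = []
    for page in sorted(Path(pages_dir).iterdir()):
        if page.name in whitelist:
            continue  # known-good pages are never deletion candidates
        if any(rx.search(page.name) for rx in compiled):
            spam.append(page.name)
    return spam
```

The point of checking the whitelist first is that an overly broad regex can no longer take out a legitimate page: anything listed in goodpages.txt is skipped before the patterns are even tried.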