[SciPy-dev] cleaning out wiki spam

Peter Wang pwang@enthought....
Tue Feb 24 13:33:05 CST 2009


On Feb 24, 2009, at 12:46 PM, Fernando Perez wrote:

> On Tue, Feb 24, 2009 at 7:47 AM, Peter Wang <pwang@enthought.com> wrote:
>
>> In my wild grepping it's possible I've blown away some good pages.
>> I'm including my list of patterns below, so folks can identify major
>> or obvious problems.  The sketchiest (but also the most effective) was
>> eliminating pages with '(2b)', but I recognize that was a pretty broad
>> stroke.
>> power*
>> Power*
>
> I would at least double check these.  Things like 'power spectrum'
> could have ended up killed by this one.

Indeed.  For common English words I was careful to do an ls first and
then "mv -v".
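
A minimal sketch of that list-first, move-second workflow, assuming each
wiki page is one filesystem entry in a single pages directory.  The
function names and the pages_dir / bad_dir arguments are hypothetical and
only for illustration; these are not the commands that were actually run.

```python
import fnmatch
import shutil
from pathlib import Path


def find_matches(pages_dir, patterns):
    """List entries whose names match any glob pattern (the 'ls' step)."""
    pages_dir = Path(pages_dir)
    return sorted(
        p for p in pages_dir.iterdir()
        # fnmatchcase is case-sensitive, so 'power*' and 'Power*' differ,
        # as they would in a shell glob.
        if any(fnmatch.fnmatchcase(p.name, pat) for pat in patterns)
    )


def quarantine(matches, bad_dir, dry_run=True):
    """Move matched entries into bad_dir, printing each one (the 'mv -v' step).

    With dry_run=True nothing is moved; the planned moves are only printed
    so the list can be reviewed first.
    """
    bad_dir = Path(bad_dir)
    targets = []
    for page in matches:
        target = bad_dir / page.name
        print(f"{page} -> {target}")
        if not dry_run:
            bad_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(page), str(target))
        targets.append(target)
    return targets
```

Running quarantine() with dry_run=True first gives a chance to spot a
legitimate page (say, one about power spectra) in the list before anything
is moved.  Moving rather than deleting keeps the operation reversible.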

> It may be that with the new moin this approach isn't necessary, but
> for ipython it was the only way to finally eliminate the spam problem.
> And it did, 100%.


I would not be averse to locking things down a bit; OTOH, if we move  
to the new Moin on the new server with CAPTCHAs, that might do most of  
the trick.

Incidentally, I went through and cleared out 2500 spam pages from the
ipython wiki directory as well, and moved them into
/home/ipython/wiki/data/badpages.  These were done with a much more
conservative set of patterns than what I applied to the main scipy page,
and I'm fairly confident they were all spam (mostly Chinese characters,
World of Warcraft gold, etc.).


-Peter


