[SciPy-dev] cleaning out wiki spam
Peter Wang
pwang@enthought....
Tue Feb 24 13:33:05 CST 2009
On Feb 24, 2009, at 12:46 PM, Fernando Perez wrote:
> On Tue, Feb 24, 2009 at 7:47 AM, Peter Wang <pwang@enthought.com>
> wrote:
>
>> In my wild grepping it's possible I've blown away some good pages.
>> I'm including my list of patterns below, so folks can identify major
>> or obvious problems. The sketchiest (but also the most effective) was
>> eliminating pages with '(2b)', but I recognize that was a pretty broad
>> stroke.
>> power*
>> Power*
>
> I would at least double check these. Things like 'power spectrum'
> could have ended up killed by this one.
Indeed. For common English words I was careful to do an ls first and
then "mv -v".
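
For anyone who wants to repeat the "ls first, then mv -v" check in a script, here is a minimal sketch in Python. It is not the set of commands actually run on the server; the function names (find_matches, quarantine) and the dry_run flag are hypothetical, chosen for illustration. It assumes each wiki page is an entry in a single pages directory and that patterns are case-sensitive shell-style globs such as power* and Power*.

```python
import fnmatch
import shutil
from pathlib import Path


def find_matches(pages_dir, pattern):
    """The 'ls' step: list entries whose names match a shell-style glob.

    Matching is case-sensitive, so 'power*' and 'Power*' are separate patterns.
    """
    return sorted(
        p for p in Path(pages_dir).iterdir()
        if fnmatch.fnmatchcase(p.name, pattern)
    )


def quarantine(matches, bad_dir, dry_run=True):
    """The 'mv -v' step: move matches into bad_dir, printing each move.

    With dry_run=True (the default) nothing is moved; the planned moves
    are only printed so they can be reviewed first.
    """
    bad_dir = Path(bad_dir)
    destinations = []
    for src in matches:
        dest = bad_dir / src.name
        print(f"{src} -> {dest}")
        if not dry_run:
            bad_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
        destinations.append(dest)
    return destinations
```

Reviewing the output of find_matches (or a dry run of quarantine) before passing dry_run=False is what would catch a legitimate page such as one about a power spectrum. Moving pages aside rather than deleting them also keeps a mistaken match recoverable.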
> It may be that with the new moin this approach isn't necessary, but
> for ipython it was the only way to finally eliminate the spam problem.
> And it did, 100%.
I would not be averse to locking things down a bit; OTOH, if we move
to the new Moin on the new server with CAPTCHAs, that might do most of
the trick.
Incidentally, I went through and cleared out 2500 spam pages from the
ipython wiki directory as well, and moved them into
/home/ipython/wiki/data/badpages. These were done with a much more
conservative set of patterns than what I applied to the main scipy
page, and I'm fairly confident they were all spam (mostly Chinese
characters, World of Warcraft gold, etc.).
-Peter