[SciPy-dev] Server spam problems spam spam: spam

Robert Kern robert.kern@gmail....
Mon Feb 23 19:58:27 CST 2009

On Mon, Feb 23, 2009 at 19:46, Pauli Virtanen <pav@iki.fi> wrote:
> Sun, 22 Feb 2009 13:40:20 -0800, Michael Abshoff wrote:
> [clip]
>> two tips for fighting spammers from the Sage project's wiki:
>>   * add a list of common Chinese words to LocalBadContent, i.e.
>> http://wiki.sagemath.org/LocalBadContent
>> Also make sure to clean out all the spammer attempts on the hard disk.
>> I.e., I deleted 6,000 directories in "pages" of the Cython wiki, since spam
>> attempts are preserved and not actually deleted from disk. If you have a
>> few tens of thousands of those in one directory, this might make every wiki
>> access painfully slow and impact the whole server.
> Continuing Gael's work, I tried to expand the LocalBadContent list:
>        http://scipy.org/LocalBadContent
> I wonder how useful this turns out to be in the end; this smells like an
> arms race... I doubt the additions cause problems for real pages, but if
> they do, some of them need to be reverted.
> [Btw, shouldn't LocalBadContent editing be restricted to those in
> EditorGroup? And could my account PauliVirtanen be added in the group?]

Done and done.
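
On the worry that new LocalBadContent entries might hit real pages: a
rough way to check is to run the patterns against the text of pages known
to be good and list any that match. The sketch below is only an
illustration. It assumes each non-empty line of the list is one regular
expression and that lines starting with "#" are comments; the function
name and the page dictionary are made up for this example and are not
part of Moin.

```python
import re


def find_false_positives(patterns, good_pages):
    """Return {page_name: [matching patterns]} for known-good pages.

    patterns   -- iterable of regex strings, one per list line (assumed format)
    good_pages -- dict mapping a page name to its text (hypothetical input)
    """
    compiled = []
    for line in patterns:
        line = line.strip()
        # Assumption: blank lines and "#" lines carry no pattern.
        if not line or line.startswith("#"):
            continue
        try:
            compiled.append((line, re.compile(line, re.IGNORECASE)))
        except re.error:
            # An invalid pattern cannot match anything, so skip it here.
            continue

    hits = {}
    for name, text in good_pages.items():
        matched = [src for src, rx in compiled if rx.search(text)]
        if matched:
            hits[name] = matched
    return hits
```

Any pattern reported for a page that is known to be valid is a candidate
for reverting.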

> Another thing is that there are apparently ca. 11600 pages in the
> Scipy.org wiki. I'd make a wild guess that at most ~500 of these are
> valid content; the rest is spam. I'm not sure if getting rid of the spam
> pages improves Moin's performance.

Probably. Are you volunteering? Peter can give you a shell account. If
you are willing to take on the other upgrades Michael recommended, such
as adding the Captcha, that would be welcome, too.

> Do we have any valid pages with CJK characters? Much of the spam seems
> Chinese, so mass-deleting at least this portion of it shouldn't be
> impossible to do, given Moin's database format.

The Chinese localized Moin help pages are valid, but that should be it.
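
A minimal sketch of how the CJK candidates could be listed before anyone
deletes anything. It assumes the on-disk layout is one directory per page
under "pages", each with a "current" file naming the latest revision and
a "revisions" subdirectory holding the text; that should be checked
against the actual installation first. The function names, the 0.3
threshold and the "keep" set (for the localized help pages) are
hypothetical. It only reports page directories and removes nothing.

```python
import os
import re

# Basic CJK Unified Ideographs plus Extension A. Which ranges matter for
# this spam is an assumption.
CJK_RE = re.compile("[\u3400-\u4dbf\u4e00-\u9fff]")


def cjk_fraction(text):
    """Fraction of characters in text that fall in the CJK ranges above."""
    if not text:
        return 0.0
    return len(CJK_RE.findall(text)) / len(text)


def current_revision_text(page_dir):
    """Return the text of the page's current revision, or None if unreadable."""
    try:
        with open(os.path.join(page_dir, "current")) as f:
            rev = f.read().strip()
        path = os.path.join(page_dir, "revisions", rev)
        with open(path, encoding="utf-8", errors="replace") as f:
            return f.read()
    except OSError:
        # Missing "current" file or revision (e.g. a deleted page).
        return None


def cjk_candidates(pages_root, keep=(), threshold=0.3):
    """List page directory names whose current text is mostly CJK.

    keep -- directory names to leave alone, e.g. the localized help pages.
    """
    candidates = []
    for name in sorted(os.listdir(pages_root)):
        page_dir = os.path.join(pages_root, name)
        if name in keep or not os.path.isdir(page_dir):
            continue
        text = current_revision_text(page_dir)
        if text is not None and cjk_fraction(text) >= threshold:
            candidates.append(name)
    return candidates
```

The resulting list would still need a look by a human before deletion,
since the threshold is a guess.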

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
