[SAC] Poorly Behaved Spiders, and a Dangerous Trap

Folks,

We have had serious load problems on www.osgeo.org in recent days, which
I believe are caused by our old friend: spiders pulling *huge* subversion
changesets out through Trac. The /robots.txt already forbids this, so
only poorly behaved spiders are doing it.
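
For reference, here is roughly what a well-behaved spider is expected to
do before fetching a changeset page. This is only a sketch using Python's
standard urllib.robotparser; the rules, the "ExampleBot" name and the
changeset URL are made up for illustration and are not copied from our
actual /robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; the real /robots.txt may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /gdal/changeset
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite spider skips any URL for which this prints False.
    print(allowed("ExampleBot", "http://trac.osgeo.org/gdal/changeset/123"))
    print(allowed("ExampleBot", "http://trac.osgeo.org/gdal/wiki"))
```

A spider that skips this check, or ignores its result, is the kind that
ends up hammering the changeset pages.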

Per the suggestions at:
   http://www.leekillough.com/robots.html

I have put a "spider trap" into place that should capture the IPs of
spiders ignoring the robots.txt and then use those IPs to forbid further
access to the trac.osgeo.org domain. Details are in the bug report at:

   http://trac.osgeo.org/osgeo/ticket/140

The IPs are recorded in:

   /var/www/trac/forbidden_ips.txt
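
The general idea of the trap can be sketched as follows. This is not the
code that is actually deployed (see the ticket for that); it is a minimal
illustration that assumes the file holds one IP per line and that the
trap lives at a hypothetical "/trap/" path which robots.txt disallows.
The function names are made up as well.

```python
from pathlib import Path


def is_forbidden(ip: str, blocklist: Path) -> bool:
    """Return True if ip is already listed (assumes one IP per line)."""
    if not blocklist.exists():
        return False
    lines = blocklist.read_text(encoding="utf-8").splitlines()
    return ip in {line.strip() for line in lines}


def record_offender(ip: str, blocklist: Path) -> None:
    """Append ip to the blocklist unless it is already there."""
    if not is_forbidden(ip, blocklist):
        with blocklist.open("a", encoding="utf-8") as fh:
            fh.write(ip + "\n")


def handle_request(ip: str, path: str, blocklist: Path,
                   trap_path: str = "/trap/") -> int:
    """Return the HTTP status a request would get under this scheme."""
    if is_forbidden(ip, blocklist):
        return 403  # previously caught, refuse everything
    if path.startswith(trap_path):
        # Only a client ignoring robots.txt should ever request this.
        record_offender(ip, blocklist)
        return 403
    return 200
```

Once an IP has hit the trap, every later request from it is refused,
including requests for ordinary pages.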

Should trac.osgeo.org suddenly stop working for anyone, we should take a
peek in there to see if that is why.

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | President OSGeo, http://osgeo.org

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank,

That's great. I also opened a ticket on this, but for a specific
spider: Twiceler. I'm leaving that ticket open for a few days and will
close it if I don't see Twiceler returning.

http://trac.osgeo.org/osgeo/ticket/139

shawn

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG1WaKhzE9g90MFjcRAipPAJ9/vHlEnp7mi6mldoKDbeeky/JXCACeO93H
GxEa80ZK5GlPyEaVcATIYYQ=
=Sdq6
-----END PGP SIGNATURE-----