[SAC] osgeo.org outage + 'fix'

Frank asked me to look into an outage on osgeo. (Or rather, begged for
someone to help him :wink:)

I found that there were many network connections open to 202.114.10.251
(via netstat), and then got into HTTP logs to find that that IP address
was repeatedly downloading specific files from the ossim SVN repository.
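
For anyone repeating the netstat step, here is a minimal sketch of how the
per-IP tally could be scripted. It assumes Linux `netstat -tn` style columns.
The function name connections_per_ip and the sample rows are made up for
illustration (documentation-range addresses, not the real output).

```python
from collections import Counter


def connections_per_ip(netstat_output):
    """Count TCP connections per remote IP in `netstat -tn` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows: proto recv-q send-q local-addr foreign-addr state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and anything that is not a TCP row
        remote_ip = fields[4].rsplit(":", 1)[0]  # drop the port
        counts[remote_ip] += 1
    return counts


# Hypothetical sample, not taken from the osgeo machine.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.0.2.1:80 203.0.113.5:40001 ESTABLISHED
tcp 0 0 192.0.2.1:80 203.0.113.5:40002 ESTABLISHED
tcp 0 0 192.0.2.1:80 198.51.100.7:51515 TIME_WAIT
"""

if __name__ == "__main__":
    for ip, n in connections_per_ip(SAMPLE).most_common():
        print(ip, n)
```

In practice the text would come from running netstat on the server; an IP
holding a large share of the connections is the one to look up in the logs.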

This is in the same APNIC block that has been giving me problems on
hypercube, so I've set up a 'deny' rule for this IP address in the ossim
SVN repo config:

Order allow,deny
Allow from all
Deny from 202.114.10.251

I've also enabled http://svn.osgeo.org/server-status for the time being:
at some point we should limit this based on IP (possibly just to
localhost), but in the short term, it provides a useful debugging tool
for seeing a quick apache status. (The reason to hide it is that there
is information there that isn't 'public', like which URLs are being
visited. Since I consider most OSGeo information generally public
knowledge, I don't see this as a major risk, but still think that it's
worth being a bit less cavalier when someone has more time: It's in
httpd/conf/httpd.conf, under the server-status Location block.)

The cause of the problem was simply that the IP address in question was
opening many connections, and holding them open for a long period of
time while downloading even small files. It's not clear why the IP
address/person behind it was doing this: It behaves somewhat like a robot
gone horribly wrong, but loops again and again, so it's not clear how
one could write a bot *that* bad and not notice. (Also not clear: Why it
is coming from APNIC, a somewhat unlikely candidate for ossim
downloads.) When the number of connections opened got to 50, the apache
server appeared to have 'locked up' due to lack of available children,
only letting traffic in as a remote IP address finally dropped a
connection. Blocking the IP address reduces each response to a very small
403 page, so those connections finish quickly and free up children.

I've checked my changes in, but would invite anyone with more insight to
do a more thorough job: this is the solution I've been using for
hypercube, and it's working somewhat okay, though we should probably
investigate a more automatic solution than "notice the server is dead,
and block the offending IP address."
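
As a starting point for something more automatic, here is a rough sketch that
turns per-IP connection counts into 'Deny from' lines like the one above. The
function name deny_rules and the threshold are hypothetical; picking a safe
threshold and reloading Apache are left out.

```python
def deny_rules(conn_counts, threshold):
    """Return 'Deny from <ip>' lines for IPs with at least `threshold` connections."""
    offenders = sorted(ip for ip, n in conn_counts.items() if n >= threshold)
    return ["Deny from " + ip for ip in offenders]


if __name__ == "__main__":
    # Hypothetical counts; documentation-range addresses only.
    counts = {"203.0.113.5": 48, "198.51.100.7": 2}
    for rule in deny_rules(counts, threshold=25):
        print(rule)
```

A human should still review the output before it goes into the config, since
a proxy or NAT gateway can legitimately open many connections.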

Regards,
--
Christopher Schmidt
MetaCarta

On Wed, Mar 5, 2008 at 5:14 AM, Christopher Schmidt
<crschmidt@metacarta.com> wrote:
...

though we should probably
investigate a more automatic solution than "notice the server is dead,
and block the offending IP address."

I have successfully used "mod_cband" in the past when some crazy
people tried to download 1TB per day from grass.itc.it:
http://modules.apache.org/search?id=899

You define how much can be downloaded in a given period, and
offending IPs are then automatically blocked for a definable
period. Set with a generous margin, this could at least keep out the
worst offenders without troubling the rest of us.
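
To make the idea concrete, here is a toy sketch of a per-IP byte budget with a
temporary block. It only illustrates the general scheme described above; it is
not mod_cband's configuration or implementation, and the class name
QuotaBlocker and all parameters are made up.

```python
class QuotaBlocker:
    """Per-IP byte budget over a fixed window, with a temporary block."""

    def __init__(self, limit_bytes, period_s, block_s):
        self.limit_bytes = limit_bytes
        self.period_s = period_s
        self.block_s = block_s
        self._usage = {}          # ip -> (window_start, bytes_in_window)
        self._blocked_until = {}  # ip -> time at which the block expires

    def allow(self, ip, nbytes, now):
        """Record a transfer of nbytes at time `now`; return False if refused."""
        if now < self._blocked_until.get(ip, 0.0):
            return False
        start, used = self._usage.get(ip, (now, 0))
        if now - start >= self.period_s:
            start, used = now, 0  # previous window expired, start a new one
        used += nbytes
        self._usage[ip] = (start, used)
        if used > self.limit_bytes:
            # Over budget: refuse this transfer and block for block_s seconds.
            self._blocked_until[ip] = now + self.block_s
            self._usage.pop(ip)
            return False
        return True


if __name__ == "__main__":
    q = QuotaBlocker(limit_bytes=100, period_s=60, block_s=300)
    print(q.allow("203.0.113.5", 60, now=0))
    print(q.allow("203.0.113.5", 60, now=10))
```

Times are passed in explicitly so the logic can be exercised without waiting;
a real deployment would use the module rather than anything like this.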

best
Markus

Christopher Schmidt wrote:

I found that there were many network connections open to 202.114.10.251
(via netstat), and then got into HTTP logs to find that that IP address
was repeatedly downloading specific files from the ossim SVN repository.

Did most hits in the logs have a 206 status? If yes then that's a damn misconfigured download accelerator launching dozens of partial downloads on the same file, using up all available children, in addition to using lots of bandwidth... I've had lots of problems with those on the maptools download server.

http://lists.maptools.org/pipermail/ms4w-users/2007-October/000945.html

Daniel
--
Daniel Morissette
http://www.mapgears.com/

On Wed, Mar 05, 2008 at 09:35:20AM -0500, Daniel Morissette wrote:

Christopher Schmidt wrote:
>
>I found that there were many network connections open to 202.114.10.251
>(via netstat), and then got into HTTP logs to find that that IP address
>was repeatedly downloading specific files from the ossim SVN repository.
>

Did most hits in the logs have a 206 status? If yes then that's a damn
misconfigured download accelerator launching dozens of partial downloads
on the same file, using up all available children, in addition to using
lots of bandwidth... I've had lots of problems with those on the
maptools download server.

Nope. They were all 200s, I'm pretty sure. (I don't have a login on the
machine, so I can't double check that: someone else can, looking in
/var/log/httpd/svn-access_log for that IP address.)
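
For whoever does have a login, here is a minimal sketch of the 200-versus-206
check. It assumes the log uses Apache's Common Log Format prefix, which I have
not verified for svn-access_log; the function name status_counts and the
sample lines are made up for illustration.

```python
import re
from collections import Counter

# Common Log Format prefix: host ident user [time] "request" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def status_counts(lines, ip):
    """Count HTTP status codes for requests made by `ip`."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(1) == ip:
            counts[m.group(2)] += 1
    return counts


# Hypothetical log lines; addresses and paths are placeholders.
SAMPLE_LOG = [
    '203.0.113.5 - - [05/Mar/2008:04:00:01 -0500] "GET /example/a HTTP/1.1" 200 512',
    '203.0.113.5 - - [05/Mar/2008:04:00:02 -0500] "GET /example/a HTTP/1.1" 206 128',
    '203.0.113.5 - - [05/Mar/2008:04:00:03 -0500] "GET /example/b HTTP/1.1" 200 512',
    '198.51.100.7 - - [05/Mar/2008:04:00:04 -0500] "GET /example/a HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    print(dict(status_counts(SAMPLE_LOG, "203.0.113.5")))
```

Against the real file, the lines would come from
open("/var/log/httpd/svn-access_log"); mostly 206s would point to a download
accelerator, mostly 200s to something else.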

Regards,
--
Christopher Schmidt
MetaCarta

I just double checked. All were 200s.

shawn

On Tue, Mar 04, 2008 at 11:14:46PM -0500, Christopher Schmidt wrote:

I found that there were many network connections open to 202.114.10.251
(via netstat), and then got into HTTP logs to find that that IP address
was repeatedly downloading specific files from the ossim SVN repository.

Are you talking about 'svn.osgeo.org' ? BTW, there's a funny thing
about this address, as the reverse lookup of 66.223.95.245 points to a
totally different hostname :slight_smile:

foehn: 21:10:19 ~> nslookup 66.223.95.245
[...]
Name: webmail.danvillestation.com
Address: 66.223.95.245
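
The same check can be scripted. A minimal sketch, assuming Python's standard
socket module; the helper name ptr_matches and its `reverse` parameter (there
so the lookup can be stubbed out) are made up for illustration.

```python
import socket


def ptr_matches(ip, expected_host, reverse=socket.gethostbyaddr):
    """Return (ptr_name, ok): the reverse-DNS name for `ip` and whether it
    equals `expected_host` (case-insensitive, trailing dot ignored)."""
    try:
        ptr_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, addrs)
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return None, False
    ok = ptr_name.rstrip(".").lower() == expected_host.rstrip(".").lower()
    return ptr_name, ok


if __name__ == "__main__":
    # Performs a live DNS lookup, so the result depends on the current records.
    print(ptr_matches("66.223.95.245", "svn.osgeo.org"))
```

This only compares the PTR name with an expected name; it does not check that
the forward record points back at the same address.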

Cheers,
  Martin.
--
Unix _IS_ user friendly - it's just selective about who its friends are !
--------------------------------------------------------------------------

Hmm, must be a reused IP address within Peer1. Can someone who's dealt with Peer1 before request that they change the reverse DNS of this address? I think it should be osgeo1.osgeo.org or something like that.

Howard

Ok, I have filed a ticket in hopes that they will change it to point to osgeo1.osgeo.org (after adding a DNS record for osgeo1.osgeo.org that points to .245).

Hope this does more good than harm :slight_smile:

Howard

On Wed, Mar 05, 2008 at 02:37:27PM -0600, Howard Butler wrote:

Ok, I have filed a ticket in hopes that they will change it to point
to osgeo1.osgeo.org (after adding a DNS record for osgeo1.osgeo.org
that points to .245).

Hmm, the _machine_ 'osgeo1' has IP .242 .... but is listed as 'osgeo'
in DNS. Something here doesn't sound consistent to me :slight_smile:

Cheers,
  Martin.
--
Unix _IS_ user friendly - it's just selective about who its friends are !
--------------------------------------------------------------------------