[SAC] http://download.osgeo.org doesn't respond

Hi all,

The host can be pinged, but HTTP access hangs forever.

Even

Spatialys - Geospatial professional services

http://www.spatialys.com

Even Rouault <even.rouault@spatialys.com> wrote on Sat, 12 May 2018, 13:08:

> Hi all,
>
> The host can be pinged, but HTTP access hangs forever.
>
> Even

I have restarted Apache; the server load was at 267.

The server should be back now.

Markus

On 12-05-18 12:45, Even Rouault wrote:

> Hi all,
>
> The host can be pinged, but HTTP access hangs forever.

I do not have sudo rights there, but a quick look shows:

- quiet cpu
- 'df -h' does not return ??
- dmesg shows a lot of:

[19270307.492129] TCP: Peer 0000:0000:0000:0000:0000:ffff:67ff:0653:43205/80 unexpectedly shrunk window 2754254510:2754258710 (repaired)

Which according to this msg:

https://security.stackexchange.com/questions/24410/tcp-peer-unexpectedly-shrunk-window-messages-in-dmesg-log

could be an attack or not...

dmesg also shows:
[20098091.200120] INFO: task apache2:22555 blocked for more than 120 seconds.
[20098091.200657] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

But I cannot restart apache myself.

Admins?

Regards,

Richard Duivenvoorde
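
For readers who want to pull the hung-task warnings quoted above out of a longer dmesg dump, here is a minimal sketch in Python. It is not something anyone in this thread ran. The function name `find_hung_tasks` and the regular expression are assumptions for illustration, and the sample text is just the two dmesg lines from the message above.

```python
import re

# Matches the kernel's hung-task warning in the form quoted in this thread.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+)\s+seconds"
)


def find_hung_tasks(dmesg_text):
    """Return (task name, pid, seconds) for each hung-task warning found."""
    # Collapse all whitespace first so warnings wrapped across lines still match.
    flat = " ".join(dmesg_text.split())
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(flat)
    ]


# Sample input: the dmesg excerpt from the message above, wrapped as mailed.
sample = """\
[20098091.200120] INFO: task apache2:22555 blocked for more than 120
seconds.
[20098091.200657] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
"""

print(find_hung_tasks(sample))
```

A task showing up here is stuck in the kernel, typically waiting on I/O, which is consistent with the stalled NFS mount found later in the thread.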

> The server should be back now.

Thanks. Alas, the issue has now re-appeared.

Spatialys - Geospatial professional services

http://www.spatialys.com

I noticed there were about 30 vsftpd server processes running under account ftp and account nobody. I killed all of them and shut off the vsftpd service.

I thought we just use https/http and sftp-server for downloads, or is that service used for something else?

I thought people download via https/http and people upload via sftp-server?

I also restarted apache2 service.

At the moment https://download.osgeo.org/gdal/ seems to be working again

I was going to keep vsftpd off for a bit to see if it was the cause of some of this.

Thanks,

Regina

From: Sac [mailto:sac-bounces@lists.osgeo.org] On Behalf Of Even Rouault
Sent: Saturday, May 12, 2018 11:18 AM
To: sac@lists.osgeo.org
Subject: Re: [SAC] http://download.osgeo.org doesn’t respond

>> The server should be back now.
>
> Thanks. Alas, the issue has now re-appeared.
>
> Spatialys - Geospatial professional services
>
> http://www.spatialys.com

On Sat, May 12, 2018 at 8:24 PM, Regina Obe <lr@pcorp.us> wrote:

> I noticed there were about 30 vsftpd server processes running under account
> ftp and account nobody. I killed all of them and shut off the vsftpd
> service.
>
> I thought we just use https/http and sftp-server for downloads, or is that
> service used for something else?
>
> I thought people download via https/http and people upload via sftp-server?

I'm not aware of any FTP service OSGeo would need.

Still the CPU load is again very high.

I checked some directories and see that the NFS mounted
/osgeo/download cannot be read:

download:/osgeo/download# time -p ls -la
^C
real 157.36
user 0.00
sys 0.00

This leads to rsync hanging:
jef 8675 0.0 0.0 11600 852 ? DNs 05:13 0:00 rsync --server -vre.iLsfxC --delay-updates --remove-source-files . osgeo4w/x86//release/

I guess the NFS daemon should be restarted?

Anyone?

Markus
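
The rsync line above has "DNs" in the STAT column, and the leading "D" means uninterruptible sleep, usually a process blocked on I/O such as a stalled NFS mount. Below is a minimal sketch, assuming `ps aux`-style output, that lists such processes. The function name `uninterruptible` is an assumption, and the sample's header row and `/sbin/init` row are made up for illustration; only the rsync row comes from this thread.

```python
def uninterruptible(ps_aux_text):
    """Return (user, pid, command) for processes whose STAT starts with 'D'."""
    hits = []
    for line in ps_aux_text.splitlines():
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        if fields[7].startswith("D"):
            hits.append((fields[0], int(fields[1]), fields[10]))
    return hits


# Sample input: the rsync row is from the message above, the rest is hypothetical.
sample = (
    "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND\n"
    "jef 8675 0.0 0.0 11600 852 ? DNs 05:13 0:00 rsync --server -vre.iLsfxC "
    "--delay-updates --remove-source-files . osgeo4w/x86//release/\n"
    "root 1 0.0 0.0 1000 100 ? Ss 05:00 0:01 /sbin/init\n"
)

for user, pid, command in uninterruptible(sample):
    print(user, pid, command)
```

Processes in this state ignore signals until the blocking I/O completes, which would explain why restarting services alone did not bring the load down.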

On 5/12/18 3:42 PM, Markus Neteler wrote:

> I guess the NFS daemon should be restarted?

I have attempted to do this. I have also restarted apache2.

The server is definitely not happy.

On 5/12/18 5:58 PM, Howard Butler wrote:

> On 5/12/18 3:42 PM, Markus Neteler wrote:
>> I guess the NFS daemon should be restarted?
>
> I have attempted to do this. I have also restarted apache2.
>
> The server is definitely not happy.

After bouncing apache2, the load started rising quickly.

Howard Butler <howard@hobu.co> wrote on Sun, 13 May 2018, 01:04:

> The server is definitely not happy.
> After bouncing apache2, the load started rising quickly.

What about a server reboot?

Markus

Markus Neteler wrote:

> I guess the NFS daemon should be restarted?

Ok, apparently "download" was mounting from "osgeo6" via NFS and, at the same
time, exporting not only via HTTP but via FTP and Rsync as well.

After removing "rpcbind" both on "download" and "osgeo6", the NFS mount
stalled and all dependent processes were stuck.
It looks like the NFS mount was set up on Oct 19, 2017. I hope I didn't do
this myself .... At least this explains why "rpcbind" was installed on
exactly these two machines, which surprised me when I noticed it.

Cheers,
  Martin.
--
Unix _IS_ user friendly - it's just selective about who its friends are !
--------------------------------------------------------------------------

Hi Martin,

On Sun, May 13, 2018 at 2:16 PM, Martin Spott <Martin.Spott@mgras.net> wrote:

> Ok, apparently "download" was mounting from "osgeo6" via NFS and, at the same
> time, exporting not only via HTTP but via FTP and Rsync as well.
>
> After removing "rpcbind" both on "download" and "osgeo6", the NFS mount
> stalled and all dependent processes were stuck.
> It looks like the NFS mount was set up on Oct 19, 2017. I hope I didn't do
> this myself .... At least this explains why "rpcbind" was installed on
> exactly these two machines, which surprised me when I noticed it.

The mount was actually needed... now osgeo4w is 403 on download.osgeo.org.

I have created a new ticket: #2164.

Markus

Hi Martin,

On Fri, 20. Oct 2017 at 00:00:57 +0200, Jürgen E. Fischer wrote:

> On Thu, 19. Oct 2017 at 09:47:36 -0700, Alex M wrote:
>> An NFS mount from osgeo6 is also an option. Could be done all by us.
>
> Done. /dev/ogdata/download created on osgeo6 (100G), nfs exported, mounted on
> download:/osgeo/download/download6, osgeo4w and qgis copied (42G) to it and
> symlinked to the original spot.
>
> For now qgis and osgeo4w were moved to a subdirectory named disabled - I'll
> remove it if nothing else pops up for a while...
>
> Not sure what we need to do on backup - there's enough space to hold more
> copies in the /mirror directory - and it already carries a lot of cruft that is
> long gone from download.

On Sun, 13. May 2018 at 12:16:44 +0000, Martin Spott wrote:

> It looks like the NFS mount was set up on Oct 19, 2017. I hope I didn't do
> this myself .... At least this explains why "rpcbind" was installed on
> exactly these two machines, which surprised me when I noticed it.

I'm surprised that you're surprised - as it's no news.

How do we deal with this? The space is needed.

Jürgen

--
Jürgen E. Fischer norBIT GmbH Tel. +49-4931-918175-31
Dipl.-Inf. (FH) Rheinstraße 13 Fax. +49-4931-918175-50
Software Engineer D-26506 Norden http://www.norbit.de

On 05/13/2018 09:29 AM, Jürgen E. Fischer wrote:

> How do we deal with this? The space is needed.

The new server, which is shipping soon, will have terabytes of space for
downloads. Not sure what the short-term fix is.

Thanks,
Alex

Hi Alex,

On Sun, 13. May 2018 at 11:39:26 -0700, Alex Mandel wrote:

> The new server, which is shipping soon, will have terabytes of space for
> downloads.

osgeo6 already has plenty of space.

> Not sure what the short-term fix is.

I reinstated NFS and blocked port 111 udp/tcp except from/to osgeo6/download.

Jürgen
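
The thread does not say which firewall tool or rules were used for this. As a rough illustration only, the sketch below generates iptables commands that accept inbound rpcbind traffic (port 111, tcp and udp) from one trusted peer and drop it from everyone else. The function name `rpcbind_rules` and the address `192.0.2.10` (a documentation-only address) are assumptions, and only the inbound direction is covered.

```python
def rpcbind_rules(peer_addr, port=111):
    """Build iptables commands that restrict inbound `port` to one peer."""
    rules = []
    for proto in ("tcp", "udp"):
        # Accept the single trusted peer first ...
        rules.append(
            f"iptables -A INPUT -p {proto} -s {peer_addr} --dport {port} -j ACCEPT"
        )
        # ... then drop every other source on the same port.
        rules.append(f"iptables -A INPUT -p {proto} --dport {port} -j DROP")
    return rules


# 192.0.2.10 is a placeholder, not the real address of osgeo6 or download.
for rule in rpcbind_rules("192.0.2.10"):
    print(rule)
```

The commands are only printed, not executed, so they can be reviewed before anything is applied. Rule order matters here: the ACCEPT for the peer has to come before the DROP for the same port.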

On Sun, May 13, 2018 at 10:43:16PM +0200, Jürgen E. Fischer wrote:

> I reinstated NFS and blocked port 111 udp/tcp except from/to osgeo6/download.

Thank you Jürgen, great solution !

--strk;

Sandro Santilli wrote:

> On Sun, May 13, 2018 at 10:43:16PM +0200, Jürgen E. Fischer wrote:
>>
>> I reinstated NFS and blocked port 111 udp/tcp except from/to osgeo6/download.
>
> Thank you Jürgen, great solution !

I disagree, this is a "not-that-great" solution because the entire NFS-setup
adds unnecessary security risks as well as complexity where a simple reverse
proxy would have been sufficient for serving the files to the user.

  Martin.

On Mon, May 14, 2018 at 02:40:11PM +0000, Martin Spott wrote:

> Sandro Santilli wrote:
>> On Sun, May 13, 2018 at 10:43:16PM +0200, Jürgen E. Fischer wrote:
>>>
>>> I reinstated NFS and blocked port 111 udp/tcp except from/to osgeo6/download.
>>
>> Thank you Jürgen, great solution !
>
> I disagree, this is a "not-that-great" solution because the entire NFS-setup
> adds unnecessary security risks as well as complexity where a simple reverse
> proxy would have been sufficient for serving the files to the user.

What I found great was his ability to do something quickly :)

If we want to find another solution we can do it now with less
pressure.

Reverse proxy
    pro: doesn't need another service
    con: uploaders need to know where to upload things ?
    con: backup configuration is more complex ?

NFS:
    pro: transparent to web server
    pro: transparent to uploader
    con: introduces a new service

Please add your cons/pros!

--strk;

Hi Martin,

On Mon, 14. May 2018 at 14:40:11 +0000, Martin Spott wrote:

> I disagree, this is a "not-that-great" solution because the entire NFS-setup
> adds unnecessary security risks as well as complexity where a simple reverse
> proxy would have been sufficient for serving the files to the user.

A reverse proxy wouldn't work - at least not directly - because we upload to
download and the scripts maintaining the package list also run on download and
expect files there and the mirrors use rsync.

I wonder why you never participated in the "no space left on device" thread
that started back in 2016 (and even back then this was nothing new), although
you were mentioned a couple of times as the only one that had privileges to do
the cleaner and easier solution of just expanding the vm's disk.

NFS is just the second "best" solution - but it was within reach. And now that
it's in use, simply killing it without a replacement disrupts service, which is
"not-that-great" either - or, better put, IMHO no option at all.

Jürgen

Jürgen E. Fischer wrote:

> A reverse proxy wouldn't work - at least not directly - because we upload to
> download and the scripts maintaining the package list also run on download and
> expect files there and the mirrors use rsync.

Why don't you simply do the scripting off-site and just upload the results ?

If this wasn't clear before: We're _not_ talking about a cosmetic issue,
instead, opening the (kernel) NFS server to the entire world is sort of a
major security risk. In times when people are discussing even tighter
restrictions on SSH connections you're introducing RSH-level security.
And, just for the record, my surprise came from realizing that people still
do this on The Internet - in 2018.

Apparently we're driven by different objectives. To me it looks like you're
tolerating even major pain for the sake of offering the complete download
portfolio. But what's your plan when someone takes Osgeo6 down via the
security holes you added?

That's my concern: Keeping the attack surface as small as possible (BTW, you
only blocked the RPC port on Osgeo6 but didn't care about NFS) in order to
maintain stability. And I might consider joining the group of those who are
in favour of restricting root access on OSGeo infrastructure to only those
people who've proven to understand the security implications of their
doings.

> I wonder why you never participated in the "no space left on device" thread
> that started back in 2016 (and even back then this was nothing new), although
> you were mentioned a couple of times as the only one that had privileges to do
> the cleaner and easier solution of just expanding the vm's disk.

I don't remember the details, but a good guess would be that I can't afford
the time to follow more than just a fraction of the list threads. And, BTW,
I could only add disk space if such thing is available, not if all space is
already occupied.

> NFS is just the second "best" solution - but it was within reach. And now that
> it's in use, simply killing it without a replacement disrupts service, which is
> "not-that-great" either - or, better put, IMHO no option at all.

I'm certainly not going to enter an edit-war on Osgeo6 RPC/NFS
configuration, but I'd like to point out, as explained above, that
disrupting parts (we're talking about a sub-section only !) of the download
service is still a lot better than running an unprotected NFS server on The
Internet.

Cheers,

  Martin
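
Whether the RPC and NFS ports are reachable from outside can be checked from any other machine. The sketch below is one way to do that in Python, under stated assumptions: it tests TCP only, it uses the standard ports 111 (rpcbind) and 2049 (NFS), and the hostname in the usage comment is a placeholder rather than a real OSGeo host. The function names are made up for this example.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


def exposed_ports(host, ports=(111, 2049), timeout=3.0):
    """Return the subset of `ports` that accept TCP connections on `host`."""
    return [p for p in ports if tcp_port_open(host, p, timeout)]


# Usage, run from a machine outside the trusted network
# ("nfs-server.example.org" is a hypothetical hostname):
#     print(exposed_ports("nfs-server.example.org"))
```

An empty list only shows that TCP connections were not accepted from the probing machine. It says nothing about UDP, and a filtered port is reported the same way as a closed one.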

Hi,

On Tue, 15. May 2018 at 12:05:51 +0000, Martin Spott wrote:

>> A reverse proxy wouldn't work - at least not directly - because we upload to
>> download and the scripts maintaining the package list also run on download and
>> expect files there and the mirrors use rsync.
>
> Why don't you simply do the scripting off-site and just upload the results ?

Because I'm not the only one uploading packages - and updating the package list
would require a full update mirror for every contributor.

Do we really need to take these workarounds further instead of just adding
space to download somehow?

Is all space on download's host actually taken? Not sure which it is and if
there were already VMs moved elsewhere.

Other options? NFS over VPN?

Jürgen