[SAC] Crash, Disk Space, Backups and Load


This evening www.osgeo.org appears to have crashed. Shortly before the
crash the system got to the point where virtually all swap space was used.
It seems it thrashed to the point where there was no more virtual memory.

I also discovered that the root partition was fairly close to full. Part
of the problem is that we had accumulated 57GB of mysql-zrm backups, done
every 3 hours, and never cleaned up.

I did a survey of /home/back and found:

   html - I never had the patience to let a du on this directory finish, likely
          somewhere between 2GB and 5GB.
   mailman - 5GB
   mysql-zrm - 57GB
   svn - 12GB
   trac - 2.6GB
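
A survey like the one above can be scripted so it runs unattended instead
of waiting on an interactive du. The following is only a rough sketch: the
function names are made up for illustration, and it sums apparent file
sizes rather than allocated disk blocks, so its totals will differ
somewhat from what du reports.

```python
import os
import stat


def tree_size_bytes(root):
    """Sum the apparent sizes of regular files under root (symlinks are skipped)."""
    total = 0
    # onerror swallows unreadable directories so one bad subtree does not abort the walk
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def survey(backup_root):
    """Return [(subdirectory, bytes), ...] for backup_root, largest first."""
    results = []
    for name in os.listdir(backup_root):
        path = os.path.join(backup_root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            results.append((name, tree_size_bytes(path)))
    # Largest first; ties are broken by name so the order is stable
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


# Example use (the path is the one surveyed above):
#     for name, size in survey("/home/back"):
#         print("%-12s %8.1f GB" % (name, size / 1e9))
```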

I blew away all mysql-zrm backups for February and this cleared 20GB of disk.

I *suspect* part of our slowness and load issue is related to all the
disk-to-disk backing up done for huge trees (subversion, html, mailman)
and that it would be very helpful if we could:

  o avoid backing up unnecessary stuff (like the fdo and mapguide doxygen
    docs under /var/www/html/files).

  o Ensure mysql-zrm (presumably drupal?) backups are rotated appropriately.

  o Backup to a different physical disk - either on the same machine or
    over to "osgeo2" aka test.osgeo.net.

  o manage more incremental backups - perhaps via live synchronization of
    subversion for instance.

  o consider dropping to less frequent backups for material of modest
    interest (such as mailman backups).
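
As a rough illustration of the rotation point above, here is a dry-run
sketch that lists backup directories older than a cutoff. It assumes each
backup run sits in its own subdirectory and that the directory's
modification time approximates when the backup was taken; both are
assumptions about the layout, not something checked against mysql-zrm.
The function name and the 30-day figure are hypothetical. It only reports
candidates and deletes nothing.

```python
import os
import time


def expired_backups(root, max_age_days, now=None):
    """Return sorted paths of subdirectories of root older than max_age_days."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    expired = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        # Only directories are considered; age is judged by modification time
        if os.path.isdir(path) and os.lstat(path).st_mtime < cutoff:
            expired.append(path)
    return expired


# Example use (path and retention period are illustrative):
#     for path in expired_backups("/home/back/mysql-zrm", 30):
#         print("would remove", path)
```

Keeping the selection separate from the deletion makes it easy to review
the list before anything is removed.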

Some study of what sorts of I/O are going on during osgeo.org slow periods
would also be helpful.
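
One possible starting point for such a study, assuming the server is a
Linux box that exposes /proc/diskstats in the usual 14-or-more-field
layout, is to take two snapshots and compare the sector counters. This is
a sketch only; the function names are invented for illustration, and lines
with fewer fields (as some older kernels print for partitions) are simply
skipped.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (sectors_read, sectors_written) from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # short or blank line; not the layout assumed here
        # fields[2] is the device name, [5] sectors read, [9] sectors written
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def io_rates(before_text, after_text, interval_seconds):
    """Return device -> (read bytes/s, write bytes/s) between two snapshots."""
    before = parse_diskstats(before_text)
    after = parse_diskstats(after_text)
    rates = {}
    for dev, (read_after, written_after) in after.items():
        if dev not in before:
            continue  # device appeared between snapshots; no baseline
        read_before, written_before = before[dev]
        rates[dev] = (
            (read_after - read_before) * SECTOR_BYTES / interval_seconds,
            (written_after - written_before) * SECTOR_BYTES / interval_seconds,
        )
    return rates


# Example use: read /proc/diskstats, sleep N seconds, read it again, then
# pass both texts and N to io_rates() and log the result during a slow period.
```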

Best regards,
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | President OSGeo, http://osgeo.org