[SAC] Disk space on {download,buildbot}.osgeo.org getting low

Folks,

The telascience blade used for buildbot.osgeo.org and download.osgeo.org
(and a few other things) is getting low on disk space:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1             25396228  22500748   1584580  94% /

That is, about 1.5GB of space left on the 25GB disk. A "du" under /osgeo,
where most of our stuff lives, shows:

[frankw@xblade14-2 osgeo]$ sudo du -s *
Password:
5767100 buildbot
5973336 download
248052 gdal
1127768 mapbuilder
12 mapserver
8 scripts

I think the download space is to be expected, and will inevitably grow
over time. But I'd like to address the buildbot space usage:

1451964 fdo/.
2284764 gdal/.
80 mapguide/.
1265148 mapserver/.
367016 proj.4/.
394328 qgis/.
3468 usr/.
40 www/.

On closer inspection, quite a bit of the space seems to be used because we
are keeping buildmaster logs indefinitely. Of the 2.3GB of GDAL space, 1.6GB
is old logs.
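
One rough way to check this kind of breakdown is to total the size of files older than some cutoff under a tree. The sketch below is only an illustration: the function name `stale_bytes` and the 14-day default are assumptions, and it counts every old regular file rather than recognising buildmaster logs specifically. It also sums apparent file sizes, so its numbers will not match "du" block counts exactly.

```python
import os
import stat
import time


def stale_bytes(root, max_age_days=14, now=None):
    """Return (total_bytes, stale_bytes) for regular files under root.

    A file counts as stale when its modification time is more than
    max_age_days in the past.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    total = 0
    stale = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, and so on
            total += st.st_size
            if st.st_mtime < cutoff:
                stale += st.st_size
    return total, stale
```

Calling it on a project directory under the buildbot tree would give the share of space held by files untouched for two weeks, which is a reasonable first estimate of what trimming could recover.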

So, first I'd like to suggest to Mateusz that a cronjob or something similar
be put in place to delete buildmaster logs older than say 2 weeks or so.
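
As a starting point, the trimming could look something like the sketch below. It is a minimal illustration, not a tested tool for our setup: the name `prune_old_files`, the dry-run default, and the 14-day cutoff are assumptions, and it removes any regular file older than the cutoff, so it should only ever be pointed at a directory that holds nothing but logs.

```python
import os
import stat
import time


def prune_old_files(root, max_age_days=14, dry_run=True, now=None):
    """Delete regular files under root older than max_age_days.

    Returns the list of paths that were removed (or, in dry-run mode,
    would have been removed). Directories are left in place.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # never touch symlinks or special files
            if st.st_mtime >= cutoff:
                continue  # recent enough to keep
            if not dry_run:
                try:
                    os.remove(path)
                except OSError:
                    continue  # could not delete; do not report it
            removed.append(path)
    return removed
```

Running it first with `dry_run=True` shows what would be deleted. Once the list looks right, a small wrapper script calling it with `dry_run=False` could be scheduled daily from cron.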

Second, I think it would be helpful to avoid putting more buildslaves on
buildbot.osgeo.org, and perhaps we should move some of the existing ones off.
How is work on VMs for buildslave usage going? What systems are these
running on? Do these systems have lots of disk space?

We aren't in a crisis yet, but I'd like to see the log trimming addressed
immediately. VMs and migrating existing slaves can proceed as time permits
but I'd appreciate a status report.

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | President OSGeo, http://osgeo.org

Frank Warmerdam wrote:

Folks,

The telascience blade used for buildbot.osgeo.org and download.osgeo.org
(and a few other things) is getting low on disk space:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1             25396228  22500748   1584580  94% /

That is, about 1.5GB of space left on the 25GB disk. A "du" under /osgeo,
where most of our stuff lives, shows:

Frank,

I raised the alarm about this when we were down to 2.5 GB free, about
2 weeks ago. Now we have 1.5GB left, and it looks like usage will keep growing.
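
To catch this earlier next time, a small check along these lines could be run periodically. It is only a sketch: the name `check_free_space` is made up for illustration, and the 2.5 GiB default threshold is an assumption that simply mirrors the figure mentioned above.

```python
import shutil

GIB = 1024 ** 3


def check_free_space(path="/", warn_below_gib=2.5):
    """Return (is_low, free_gib) for the filesystem containing path.

    free_gib is the space available to unprivileged users, which
    corresponds to the "Available" column of df.
    """
    usage = shutil.disk_usage(path)
    free_gib = usage.free / GIB
    return free_gib < warn_below_gib, free_gib
```

For reference, the 1584580 1K-blocks reported as available work out to about 1.5 GiB, so a check with this threshold would already be flagging the disk as low.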

I think the download space is to be expected, and will inevitably grow
over time. But I'd like to address the buildbot space usage:

1451964 fdo/.
2284764 gdal/.
80 mapguide/.
1265148 mapserver/.
367016 proj.4/.
394328 qgis/.
3468 usr/.
40 www/.

On closer inspection, quite a bit of the space seems to be used because we
are keeping buildmaster logs indefinitely. Of the 2.3GB of GDAL space, 1.6GB
is old logs.

Yes, this diagnosis is correct.

So, first I'd like to suggest to Mateusz that a cronjob or something
similar be put in place to delete buildmaster logs older
than say 2 weeks or so.

OK, I will solve it this way.

Second, I think it would be helpful to avoid putting more buildslaves on
buildbot.osgeo.org, and perhaps we should move some of the existing ones
off.

But where?

Generally, the idea is to host 1 master and 1 slave for every project on
xblade14-2.

How is work on VMs for buildslave usage going? What systems are
these running on? Do these systems have lots of disk space?

Currently, we can host 2-3 VM instances, so it may not be enough.
I'm not sure what the name of the host machine is; I connect to it
using VNC and Hamachi (VPN).

We aren't in a crisis yet, but I'd like to see the log trimming addressed
immediately.

I'll fix it today after the GDAL meeting.

VMs and migrating existing slaves can proceed as time
permits but I'd appreciate a status report.

First, I think we need to discuss this setup with John, and then
design the best solution (1 VM per project, 1 VM per software configuration,
or...).

Cheers
--
Mateusz Loskot
http://mateusz.loskot.net

Frank Warmerdam wrote:

So, first I'd like to suggest to Mateusz that a cronjob or something
similar be put in place to delete buildmaster logs older than say 2 weeks or so.

Frank,

I will work on it tonight.
For now, I cleaned all logs in all BB instances manually and
now we have 3.2G of free space.

Second, I think it would be helpful to avoid putting more buildslaves on
buildbot.osgeo.org, and perhaps we should move some of the existing ones
off.

Technically, it's possible to set up BB slaves on other machines and
connect them all to a common master. Just tell me which machine to use.

How is work on VMs for buildslave usage going? What systems are these
running on? Do these systems have lots of disk space?

I agree with this idea, as we've discussed on IRC.
AFAIK, John is getting more RAM soon, then we can run more VMs.
Just for the record, here is our discussion of possible configurations:

------------------------------------------------------------------------
Nov 20 18:34:29 <FrankW> BTW, perhaps we can focus on 2 virtual machines
for linux.
Nov 20 18:34:39 <mloskot> sure
Nov 20 18:34:41 <FrankW> One "minimally configured" in terms of
available supporting libraries and services.
Nov 20 18:35:01 <FrankW> And one "maximally configured", including if
eventually possible stuff like postgres, oracle, mysql, etc.
Nov 20 18:35:20 <mloskot> and both instances are accessible for all
projects?
Nov 20 18:35:30 <FrankW> that is my thinking, yes.
Nov 20 18:35:33 <mloskot> yes, makes sense
Nov 20 18:35:59 <mloskot> The simplest solution is to keep 1 VM per
project, but this option will cost most regarding hardware resources
Nov 20 18:36:15 <mloskot> So, 2 VM but usable for as many projects as
possible is best option I think
Nov 20 18:36:32 <FrankW> Yes, I'm concerned having many VMs active would
be demanding, especially in terms of RAM.
Nov 20 18:36:39 <mloskot> right
Nov 20 18:37:16 <FrankW> Projects with special needs can always run
their own slaves for special cases on outside servers.
------------------------------------------------------------------------

Cheers
--
Mateusz Loskot
http://mateusz.loskot.net