[SAC] xblade14-2 at load average: 9.75

Hi xblade14-2 users,

please be so kind as to use "nice" for your routine jobs.
grass.osgeo.org is partially unresponsive (since we share resources
here):

exit 04:18:04 up 33 days, 5:04, 1 user, load average: 9.75, 8.79, 7.12
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
neteler pts/3 host188-108-dyna 04:16 3.00s 0.04s 0.01s w
[neteler@xblade14-2 cronjobs]$ top
top - 04:18:48 up 33 days, 5:04, 1 user, load average: 9.25, 8.78, 7.18
Tasks: 133 total, 4 running, 129 sleeping, 0 stopped, 0 zombie
Cpu(s): 78.4% us, 17.3% sy, 0.0% ni, 0.0% id, 4.3% wa, 0.0% hi, 0.0% si
Mem: 1034320k total, 1018812k used, 15508k free, 31296k buffers
Swap: 2096472k total, 118788k used, 1977684k free, 327736k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10883 apache 16 0 69360 36m 6436 S 26.9 3.7 16:48.33 httpd
21243 buildbot 18 0 45736 21m 10m R 25.3 2.2 0:19.93 python
30840 apache 15 0 62704 30m 6260 S 15.6 3.0 1:01.47 httpd
8709 pramsey 18 0 24132 14m 2596 D 8.0 1.4 0:03.28 doxygen
8200 mysql 15 0 131m 18m 4212 S 6.6 1.8 321:59.04 mysqld
17918 buildbot 16 0 33640 16m 1808 S 5.0 1.7 20:53.63 buildbot
17923 buildbot 15 0 12744 4188 1028 S 4.7 0.4 9:57.00 buildbot
31458 buildbot 16 0 26648 12m 7708 D 2.0 1.3 0:01.64 python
27007 buildbot 17 0 4208 900 544 D 0.7 0.1 0:04.44 cp
8819 neteler 16 0 4756 1616 1212 R 0.3 0.2 0:00.12 top
    1 root 15 0 1744 512 488 S 0.0 0.0 0:03.06 init
[...]

doxygen and buildbot should really use a different nice level.
Please review your cronjobs.

xblade14-2 hosts GDAL, GRASS, QGIS and more.

thanks
Markus

On Mon, 1 Sep 2008 13:21:33 +0200, "Markus Neteler" <neteler@osgeo.org>
wrote:

Hi xblade14-2 users,

...

Markus,

I've restarted all buildbot instances and cleaned the logs.

Generally, buildbot scheduled jobs are configured not to run at the
same time.
But commits also trigger buildbot jobs, and perhaps that was the case here.

Best regards,
--
Mateusz Loskot
http://mateusz.loskot.net

On Mon, Sep 1, 2008 at 1:29 PM, <mateusz@loskot.net> wrote:

...

Thanks Mateusz, but the nice level is still 0.
I feel that it should be more "friendly" to the other users (and it does not
matter if buildbot takes a few minutes more).

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25762 buildbot 23 0 0 0 0 R 4.3 0.0 0:00.13 cc1
25682 buildbot 25 0 5208 1892 1008 S 3.0 0.2 0:00.09 sh
10300 buildbot 15 0 15260 9032 1544 S 1.0 0.9 0:01.69 buildbot
10304 buildbot 15 0 12348 5764 1000 S 1.0 0.6 0:01.77 buildbot
13360 neteler 16 0 4760 1620 1212 R 0.3 0.2 0:00.26 top

NI = 0 - it would suffice to add "nice" to the launch command.

thanks
Markus
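
The suggestion above amounts to prefixing the launch command with "nice".
A minimal sketch, assuming the job is started from a shell script: the
helper name run_nice and the echoed placeholder command are hypothetical and
are not buildbot's actual start-up. Printing the current niceness by running
"nice" without a command is GNU coreutils behaviour, assumed here.

```shell
#!/bin/sh
# Hypothetical helper: start a command with its niceness raised by 10.
# Child processes inherit the value, so everything the command spawns
# (compilers, cp, ...) runs at the same lowered priority.
run_nice() {
    nice -n 10 "$@"
}

# Placeholder for the real launch command.
run_nice sh -c 'echo "job running with niceness $(nice)"'
```

With such a prefix in place, the NI column in top should show 10 instead of 0
for the job and its children.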

Markus Neteler wrote:

...

Markus,

Understood and I have no problem with making buildbot nice.
What priority level would you suggest?

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org

On Mon, Sep 1, 2008 at 1:53 PM, Mateusz Loskot <mateusz@loskot.net> wrote:

...

The default is 10, so maybe use that? I set it for all GRASS build
jobs.

Best regards,
Markus
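
For reference, the "default" here is the adjustment that "nice" applies when
no -n option is given, which POSIX specifies as 10. A small sketch to check
this on a given machine: it assumes that "nice" run without a command prints
the current niceness (GNU coreutils behaviour), and the script path in the
crontab comment is hypothetical.

```shell
#!/bin/sh
# Niceness of this shell (usually 0).
base=$(nice)
# Niceness seen by a child started through plain "nice", i.e. without -n.
niced=$(nice nice)
echo "shell: $base, child under nice: $niced"

# A cron entry would use the same prefix (hypothetical script path):
#   0 3 * * * nice /path/to/build-job.sh
```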

Markus Neteler wrote:

...

Markus / Mateusz,

Generally speaking, I think we should work to push buildbot slave activity
off this blade - initially to buildtest.osgeo.org, but possibly other places
as well.

Best regards,

--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent

On Mon, Sep 1, 2008 at 3:42 PM, Frank Warmerdam <warmerdam@pobox.com> wrote:

...

Hi SAC;

xblade14-2 is again sloooowly responding to http requests...
(current load average: 3.29, 2.09, 1.48 due to buildbot)

While I can imagine that it is painful to move it off, just curious
if there are plans to do so. I realize that the recently added
nice level 10 has only little positive effect on Apache.

Another optimization might be to schedule it for the European
night (= Asian morning, US evening) instead of the European day
(= Asian evening, US morning). Fewer users would be affected,
I assume.

Best regards,
Markus (sorry to be a pain here)

Markus Neteler wrote:

Hi SAC;

xblade14-2 is again sloooowly responding to http requests...
(current load average: 3.29, 2.09, 1.48 due to buildbot)
  
I don't notice this very much when I browse the GRASS pages while the
buildbot is building GDAL. The wiki is a bit slow, but that's usual, I
believe. I noticed a cracker bot trying to access buildbot.osgeo.org
(failed ssh connects every 3 seconds for some time), but that was at a
different time than when you were experiencing problems, so it probably
wasn't that. Are you sure the problem is xblade14 being slow?

While I can imagine that it is painful to move it off, just curious
if there are plans to do so. I realize that the recently added
nice level 10 has only little positive effect on Apache.
  
I'm scheduled to move some of the slaves to buildtest, away from buildbot. But I need to learn some things first.

Another optimization might be to schedule it for the European
night (= Asian morning, US evening) instead of the European day
(= Asian evening, US morning). Fewer users would be affected,
I assume.
  
Some rescheduling could be done, but the GDAL buildbot, for example, runs
more or less continuously.

Ari

_______________________________________________
Sac mailing list
Sac@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/sac
  
--
Prof. Ari Jolma
Environmental Management Information Technology
Teknillinen Korkeakoulu / Helsinki University of Technology
tel: +358 9 4511 address: POBox 5300, 02015 TKK, Finland
Email: ari.jolma at tkk.fi URL: http://geoinformatics.tkk.fi