[SAC] OOM killer on tracsvn

I've found the OOM Killer kicking in often on the TracSVN machine,
and mostly killing Apache:

Sep 1 06:52:56 git kernel: [17261983.933073] Out of memory: Kill process 29329 (apache2) score 254 or sacrifice child
Sep 1 06:57:05 git kernel: [17262225.513234] Out of memory: Kill process 29475 (apache2) score 186 or sacrifice child
Sep 1 07:00:32 git kernel: [17262439.763616] Out of memory: Kill process 29542 (apache2) score 210 or sacrifice child
Sep 1 07:17:39 git kernel: [17263466.984870] Out of memory: Kill process 29594 (apache2) score 267 or sacrifice child
Sep 1 08:53:18 git kernel: [17269205.178188] Out of memory: Kill process 23583 (apache2) score 156 or sacrifice child
Sep 1 08:58:11 git kernel: [17269498.021185] Out of memory: Kill process 23729 (apache2) score 190 or sacrifice child
Sep 1 09:03:43 git kernel: [17269829.824966] Out of memory: Kill process 24118 (apache2) score 222 or sacrifice child
Sep 1 09:08:22 git kernel: [17270109.803999] Out of memory: Kill process 24364 (apache2) score 309 or sacrifice child
Sep 3 11:19:44 git kernel: [17450790.920356] Out of memory: Kill process 4816 (apache2) score 299 or sacrifice child
Sep 3 11:23:10 git kernel: [17450997.481283] Out of memory: Kill process 4916 (apache2) score 201 or sacrifice child
Sep 5 00:21:48 git kernel: [17584109.845014] Out of memory: Kill process 2562 (apache2) score 175 or sacrifice child
Sep 5 00:26:12 git kernel: [17584379.936770] Out of memory: Kill process 2818 (apache2) score 187 or sacrifice child
Sep 5 00:30:36 git kernel: [17584643.198125] Out of memory: Kill process 3351 (apache2) score 243 or sacrifice child
Sep 5 00:40:50 git kernel: [17585255.206892] Out of memory: Kill process 2990 (apache2) score 323 or sacrifice child
Sep 5 09:45:27 git kernel: [17617934.310446] Out of memory: Kill process 8539 (apache2) score 262 or sacrifice child
Sep 6 00:23:53 git kernel: [48504.691523] Out of memory: Kill process 20254 (apache2) score 164 or sacrifice child
Sep 6 00:28:01 git kernel: [48753.099289] Out of memory: Kill process 20330 (apache2) score 159 or sacrifice child
Sep 6 00:32:13 git kernel: [49002.137863] Out of memory: Kill process 20395 (apache2) score 174 or sacrifice child
Sep 6 00:37:17 git kernel: [49309.507685] Out of memory: Kill process 20950 (apache2) score 280 or sacrifice child
Sep 6 01:01:51 git kernel: [50783.765895] Out of memory: Kill process 28945 (apache2) score 197 or sacrifice child
Sep 6 01:07:07 git kernel: [51095.220359] Out of memory: Kill process 29060 (apache2) score 205 or sacrifice child
Sep 6 01:11:02 git kernel: [51333.870239] Out of memory: Kill process 29344 (apache2) score 183 or sacrifice child
Sep 6 01:16:21 git kernel: [51653.844987] Out of memory: Kill process 29159 (apache2) score 323 or sacrifice child
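
To see how the kills cluster, log lines like the ones above can be tallied per day and per command. This is only a minimal sketch: the regular expression is written against the syslog format shown in this message, and `summarize_oom` is a hypothetical helper name, not an existing tool.

```python
import re
from collections import Counter

# Matches kernel lines of the form quoted above, e.g.
# "Sep 1 06:52:56 git kernel: [...] Out of memory: Kill process 29329 (apache2) score 254 ..."
OOM_LINE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>[\d:]{8})\s+\S+\s+kernel:"
    r".*Out of memory: Kill process (?P<pid>\d+) \((?P<comm>[^)]+)\) score (?P<score>\d+)"
)

def summarize_oom(lines):
    """Count OOM kills per (day, command) and return the highest score seen."""
    kills = Counter()
    max_score = 0
    for line in lines:
        m = OOM_LINE.search(line)
        if not m:
            continue  # not an OOM kill line
        day = f"{m['month']} {int(m['day'])}"
        kills[(day, m["comm"])] += 1
        max_score = max(max_score, int(m["score"]))
    return kills, max_score
```

Calling `summarize_oom(open(path))` on the kernel log (its location depends on the syslog setup) returns a counter keyed by `(day, command)` plus the highest score.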

Would it make more sense to disable the OOM killer, so as to actually
*refuse* to allocate unavailable memory rather than pretending
memory is infinite and then killing arbitrary processes?

If nothing else, it might give processes a chance to do something upon
running out of memory (I guess Apache could spawn fewer children, for example).
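
For context, on Linux the behaviour asked about here is governed by the `vm.overcommit_memory` sysctl rather than by an on/off switch for the OOM killer. Mode 2 makes allocations beyond a commit limit fail instead of being granted optimistically. The sketch below reads the current mode and estimates that limit. The function names are hypothetical, and the formula ignores huge pages and assumes the default `vm.overcommit_ratio` of 50.

```python
from pathlib import Path

# Meaning of the values of the vm.overcommit_memory sysctl.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting: allocations beyond the commit limit fail",
}

def describe_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Describe the current overcommit mode, or return None if it cannot be read."""
    try:
        mode = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")

def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50):
    """Approximate commit limit in mode 2: swap plus ratio% of RAM (huge pages ignored)."""
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100
```

Whether strict accounting suits this machine is a separate question: a limit that is too low makes allocations fail while physical memory is still free.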

--strk;

  () Free GIS & Flash consultant/developer
  /\ https://strk.kbt.io/services.html

On Tue, Sep 6, 2016 at 10:25 AM, Sandro Santilli <strk@kbt.io> wrote:

> I've found the OOM Killer kicking in often on the TracSVN machine,
> and mostly killing Apache:
>
> Sep 1 06:52:56 git kernel: [17261983.933073] Out of memory: Kill process 29329 (apache2) score 254 or sacrifice child
>
> ...
>
> Would it make more sense to disable the OOM killer, so as to actually
> *refuse* to allocate unavailable memory rather than pretending
> memory is infinite and then killing arbitrary processes?
>
> If nothing else, it might give processes a chance to do something upon
> running out of memory (I guess Apache could spawn fewer children, for example).

On a machine short of memory I usually run a cron job that restarts
the Apache server just before the OOM killer would kick in.
In this case that is likely a bad approach.

First of all, we should tune the Apache settings and the DB(s) to
have a smaller memory footprint.

Markus

On Tue, Sep 06, 2016 at 01:44:58PM +0200, Markus Neteler wrote:

> On a machine short of memory I usually run a cron job that restarts
> the Apache server just before the OOM killer would kick in.
> In this case that is likely a bad approach.

The problem with the OOM killer is that it makes it impossible for a
process to take responsibility for its own memory usage. For example,
an SQL query could simply fail (aborting the transaction) if it
could not allocate enough memory. That is impossible
with the OOM killer active, because the operating system never
lets the process know that there is not enough memory.

> First of all, we should tune the Apache settings and the DB(s) to
> have a smaller memory footprint.

The first step to optimization is understanding what's going on.
Does anyone have an idea how to tell _which_ Apache-served
applications are taking up the most memory? It's somewhat easier
when Apache only acts as a proxy (as in the Gogs case), but much
harder with in-process Apache extensions (as for Trac).
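
One partial approach is to record the resident size of each Apache child from `/proc` and then match the PIDs against the access log. Apache's `%P` log format field records the PID of the child that served each request. The sketch below covers only the first half, and both function names are hypothetical. `VmRSS` also counts pages shared between children, so the per-child figures overstate the real cost.

```python
import os
import re

def parse_status(text):
    """Extract (name, rss_kb) from the contents of /proc/<pid>/status."""
    name = re.search(r"^Name:\s+(\S+)", text, re.M)
    rss = re.search(r"^VmRSS:\s+(\d+)\s+kB", text, re.M)
    return (name.group(1) if name else None, int(rss.group(1)) if rss else 0)

def rss_by_pid(command="apache2", proc="/proc"):
    """Return {pid: rss_kb} for running processes whose name equals `command`."""
    result = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc, entry, "status")) as f:
                name, rss = parse_status(f.read())
        except OSError:
            continue  # process exited in the meantime, or is not readable
        if name == command:
            result[int(entry)] = rss
    return result
```

Sampling `rss_by_pid()` periodically and joining the largest PIDs with the logged requests would hint at which URLs, and so which application, make a child grow.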

Current situation:

Mem: 8200012k total, 2532680k used, 5667332k free, 52508k buffers
Swap: 4096568k total, 68612k used, 4027956k free, 1271972k cached

The top 21 processes by memory usage are all Apache forks, starting with:

  PID USER      PR  NI  VIRT  RES   SHR S %CPU %MEM    TIME+  COMMAND
31541 www-data  20   0  419m  82m   10m S    0  1.0  0:03.80  /usr/sbin/apache2 -k start
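
As a back-of-the-envelope check, the figures above give a ceiling on how many Apache children of that size fit in RAM. This is the kind of number one would compare against `MaxClients` (called `MaxRequestWorkers` in Apache 2.4), assuming the prefork MPM. The 2 GiB reserve for the databases, Gogs and the OS is a made-up placeholder, not a measurement. The snapshot was taken with plenty of memory free, so children are presumably larger when the OOM killer fires.

```python
def max_workers(total_kb, reserved_kb, per_child_kb):
    """Upper bound on the number of children that fit in RAM without swapping."""
    return max(0, (total_kb - reserved_kb) // per_child_kb)

total_kb = 8200012              # "Mem: total" from the top output above
per_child_kb = 82 * 1024        # 82m RES of the largest apache2 child
reserved_kb = 2 * 1024 * 1024   # assumption: 2 GiB kept for everything else

estimate = max_workers(total_kb, reserved_kb, per_child_kb)
print(estimate)
```

Because RES includes memory shared between forks, this bound is pessimistic for small children. It becomes optimistic as soon as individual children grow well past the sampled size.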

--strk;

On Tue, Sep 06, 2016 at 03:43:51PM +0200, Sandro Santilli wrote:

> The first step to optimization is understanding what's going on.
> Does anyone have an idea how to tell _which_ Apache-served
> applications are taking up the most memory? It's somewhat easier
> when Apache only acts as a proxy (as in the Gogs case), but much
> harder with in-process Apache extensions (as for Trac).
>
> Current situation:
>
> Mem: 8200012k total, 2532680k used, 5667332k free, 52508k buffers
> Swap: 4096568k total, 68612k used, 4027956k free, 1271972k cached
>
> The top 21 processes by memory usage are all Apache forks, starting with:
>
>   PID USER      PR  NI  VIRT  RES   SHR S %CPU %MEM    TIME+  COMMAND
> 31541 www-data  20   0  419m  82m   10m S    0  1.0  0:03.80  /usr/sbin/apache2 -k start

For the record, the Gogs process follows with:

  PID USER      PR  NI  VIRT  RES   SHR S %CPU %MEM    TIME+  COMMAND
19736 git       20   0  362m  31m  6300 S    0  0.4  2:47.91  /home/git/gogs/gogs web

--strk;