Thank you. memory cgroup and OOM troubles are handled in linux-mm. On Sun, 29 May 2011 23:24:07 +0100 "Cal Leeming [Simplicity Media Ltd]" <cal.leeming@xxxxxxxxxxxxxxxxxxxxxxxx> wrote: > Some further logs: > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.369927] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.369939] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.399285] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.399296] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.428690] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.428702] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.487696] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.487708] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.517023] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.517035] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.546379] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:38 vicky kernel: [ 2283.546391] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.310789] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.310804] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.369918] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.369930] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.399284] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.399296] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.433634] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.433648] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.463947] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.463959] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.493439] redis-server > invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=-17 > ./log/syslog:May 30 07:44:43 vicky kernel: [ 2288.493451] > [<ffffffff810b12b7>] ? oom_kill_process+0x82/0x283 > > hmm, in short, applications has -17 oom_adj in default with 2.6.32.41 ? AFAIK, no kernel has such crazy settings as default.. Does your 2 kernel uses the same environment/distribution ? Thanks, -Kame > On 29/05/2011 22:50, Cal Leeming [Simplicity Media Ltd] wrote: > > First of all, my apologies if I have submitted this problem to the > > wrong place, spent 20 minutes trying to figure out where it needs to > > be sent, and was still none the wiser. > > > > The problem is related to applying memory limitations within a cgroup. > > If the OOM killer kicks in, it gets stuck in a loop where it tries to > > kill a process which has an oom_adj of -17. This causes an infinite > > loop, which in turn locks up the system. > > > > May 30 03:13:08 vicky kernel: [ 1578.117055] Memory cgroup out of > > memory: kill process 6016 (java) score 0 or a child > > May 30 03:13:08 vicky kernel: [ 1578.117154] Memory cgroup out of > > memory: kill process 6016 (java) score 0 or a child > > May 30 03:13:08 vicky kernel: [ 1578.117248] Memory cgroup out of > > memory: kill process 6016 (java) score 0 or a child > > May 30 03:13:08 vicky kernel: [ 1578.117343] Memory cgroup out of > > memory: kill process 6016 (java) score 0 or a child > > May 30 03:13:08 vicky kernel: [ 1578.117441] Memory cgroup out of > > memory: kill process 6016 (java) score 0 or a child > > > > > > root@vicky [/home/foxx] > uname -a > > Linux vicky 2.6.32.41-grsec #3 SMP Mon May 30 02:34:43 BST 2011 x86_64 > > GNU/Linux > > (this happens on both the grsec patched and non patched 2.6.32.41 kernel) > > > > When this is encountered, the memory usage across the whole server is > > still within limits (not even hitting swap). > > > > The memory configuration for the cgroup/lxc is: > > lxc.cgroup.memory.limit_in_bytes = 3000M > > lxc.cgroup.memory.memsw.limit_in_bytes = 3128M > > > > Now, what is even more strange, is that when running under the > > 2.6.32.28 kernel (both patched and unpatched), this problem doesn't > > happen. However, there is a slight difference between the two kernels. > > The 2.6.32.28 kernel gives a default of 0 in the /proc/X/oom_adj, > > where as the 2.6.32.41 gives a default of -17. I suspect this is the > > root cause of why it's showing in the later kernel, but not the earlier. > > > > To test this theory, I started up the lxc on both servers, and then > > ran a one liner which showed me all the processes with an oom_adj of -17: > > > > (the below is the older/working kernel) > > root@xxxxxxxxxxxxxxxxx [/mnt/encstore/lxc] > uname -a > > Linux courtney.internal 2.6.32.28-grsec #3 SMP Fri Feb 18 16:09:07 GMT > > 2011 x86_64 GNU/Linux > > root@xxxxxxxxxxxxxxxxx [/mnt/encstore/lxc] > for x in `find /proc > > -iname 'oom_adj' | xargs grep "\-17" | awk -F '/' '{print $3}'` ; do > > ps -p $x --no-headers ; done > > grep: /proc/1411/task/1411/oom_adj: No such file or directory > > grep: /proc/1411/oom_adj: No such file or directory > > 804 ? 00:00:00 udevd > > 804 ? 00:00:00 udevd > > 25536 ? 00:00:00 sshd > > 25536 ? 00:00:00 sshd > > 31861 ? 00:00:00 sshd > > 31861 ? 00:00:00 sshd > > 32173 ? 00:00:00 udevd > > 32173 ? 00:00:00 udevd > > 32174 ? 00:00:00 udevd > > 32174 ? 00:00:00 udevd > > > > (the below is the newer/broken kernel) > > root@vicky [/mnt/encstore/ssd/kernel/linux-2.6.32.41] > uname -a > > Linux vicky 2.6.32.41-grsec #3 SMP Mon May 30 02:34:43 BST 2011 x86_64 > > GNU/Linux > > root@vicky [/mnt/encstore/ssd/kernel/linux-2.6.32.41] > for x in > > `find /proc -iname 'oom_adj' | xargs grep "\-17" | awk -F '/' '{print > > $3}'` ; do ps -p $x --no-headers ; done > > grep: /proc/3118/task/3118/oom_adj: No such file or directory > > grep: /proc/3118/oom_adj: No such file or directory > > 895 ? 00:00:00 udevd > > 895 ? 00:00:00 udevd > > 1091 ? 00:00:00 udevd > > 1091 ? 00:00:00 udevd > > 1092 ? 00:00:00 udevd > > 1092 ? 00:00:00 udevd > > 2596 ? 00:00:00 sshd > > 2596 ? 00:00:00 sshd > > 2608 ? 00:00:00 sshd > > 2608 ? 00:00:00 sshd > > 2613 ? 00:00:00 sshd > > 2613 ? 00:00:00 sshd > > 2614 pts/0 00:00:00 bash > > 2614 pts/0 00:00:00 bash > > 2620 pts/0 00:00:00 sudo > > 2620 pts/0 00:00:00 sudo > > 2621 pts/0 00:00:00 su > > 2621 pts/0 00:00:00 su > > 2622 pts/0 00:00:00 bash > > 2622 pts/0 00:00:00 bash > > 2685 ? 00:00:00 lxc-start > > 2685 ? 00:00:00 lxc-start > > 2699 ? 00:00:00 init > > 2699 ? 00:00:00 init > > 2939 ? 00:00:00 rc > > 2939 ? 00:00:00 rc > > 2942 ? 00:00:00 startpar > > 2942 ? 00:00:00 startpar > > 2964 ? 00:00:00 rsyslogd > > 2964 ? 00:00:00 rsyslogd > > 2964 ? 00:00:00 rsyslogd > > 2964 ? 00:00:00 rsyslogd > > 2980 ? 00:00:00 startpar > > 2980 ? 00:00:00 startpar > > 2981 ? 00:00:00 ctlscript.sh > > 2981 ? 00:00:00 ctlscript.sh > > 3016 ? 00:00:00 cron > > 3016 ? 00:00:00 cron > > 3025 ? 00:00:00 mysqld_safe > > 3025 ? 00:00:00 mysqld_safe > > 3032 ? 00:00:00 sshd > > 3032 ? 00:00:00 sshd > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3097 ? 00:00:00 mysqld.bin > > 3113 ? 00:00:00 ctl.sh > > 3113 ? 00:00:00 ctl.sh > > 3115 ? 00:00:00 sleep > > 3115 ? 00:00:00 sleep > > 3116 ? 00:00:00 .memcached.bin > > 3116 ? 00:00:00 .memcached.bin > > > > > > As you can see, it is clear that the newer kernel is setting -17 by > > default, which in turn is causing the OOM killer loop. > > > > So I began to try and find what may have caused this problem by > > comparing the two sources... > > > > I checked the code for all references to 'oom_adj' and 'oom_adjust' in > > both code sets, but found no obvious differences: > > grep -R -e oom_adjust -e oom_adj . | sort | grep -R -e oom_adjust -e > > oom_adj > > > > Then I checked for references to "-17" in all .c and .h files, and > > found a couple of matches, but only one obvious one: > > grep -R "\-17" . | grep -e ".c:" -e ".h:" -e "\-17" | wc -l > > ./include/linux/oom.h:#define OOM_DISABLE (-17) > > > > But again, a search for OOM_DISABLE came up with nothing obvious... > > > > In a last ditch attempt, I did a search for all references to 'oom' > > (case-insensitive) in both code bases, then compared the two: > > root@annabelle [~/lol/linux-2.6.32.28] > grep -i -R "oom" . | sort -n > > > /tmp/annabelle.oom_adj > > root@vicky [/mnt/encstore/ssd/kernel/linux-2.6.32.41] > grep -i -R > > "oom" . | sort -n > /tmp/vicky.oom_adj > > > > and this brought back (yet again) nothing obvious.. > > > > > > root@vicky [/mnt/encstore/ssd/kernel/linux-2.6.32.41] > md5sum > > ./include/linux/oom.h > > 2a32622f6cd38299fc2801d10a9a3ea8 ./include/linux/oom.h > > > > root@annabelle [~/lol/linux-2.6.32.28] > md5sum ./include/linux/oom.h > > 2a32622f6cd38299fc2801d10a9a3ea8 ./include/linux/oom.h > > > > root@vicky [/mnt/encstore/ssd/kernel/linux-2.6.32.41] > md5sum > > ./mm/oom_kill.c > > 1ef2c2bec19868d13ec66ec22033f10a ./mm/oom_kill.c > > > > root@annabelle [~/lol/linux-2.6.32.28] > md5sum ./mm/oom_kill.c > > 1ef2c2bec19868d13ec66ec22033f10a ./mm/oom_kill.c > > > > > > > > Could anyone please shed some light as to why the default oom_adj is > > set to -17 now (and where it is actually set)? From what I can tell, > > the fix for this issue will either be: > > > > 1. Allow OOM killer to override the decision of ignoring oom_adj == > > -17 if an unrecoverable loop is encountered. > > 2. Change the default back to 0. > > > > Again, my apologies if this bug report is slightly unorthodox, or > > doesn't follow usual procedure etc. I can assure you I have tried my > > absolute best to give all the necessary information though. > > > > Cal > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>