______________________________________________________________
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> To: azurIt <azurit@xxxxxxxx>
> Date: 17.09.2013 02:02
> Subject: Re: [patch 0/7] improve memcg oom killer robustness v2
>
> CC: "Michal Hocko" <mhocko@xxxxxxx>, "Andrew Morton" <akpm@xxxxxxxxxxxxxxxxxxxx>, "David Rientjes" <rientjes@xxxxxxxxxx>, "KAMEZAWA Hiroyuki" <kamezawa.hiroyu@xxxxxxxxxxxxxx>, "KOSAKI Motohiro" <kosaki.motohiro@xxxxxxxxxxxxxx>, linux-mm@xxxxxxxxx, cgroups@xxxxxxxxxxxxxxx, x86@xxxxxxxxxx, linux-arch@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
>On Mon, Sep 16, 2013 at 10:52:46PM +0200, azurIt wrote:
>> >On Mon 16-09-13 17:05:43, azurIt wrote:
>> >> >On Mon 16-09-13 16:13:16, azurIt wrote:
>> >> >[...]
>> >> >> >You can use sysrq+l via serial console to see tasks hogging the CPU or
>> >> >> >sysrq+t to see all the existing tasks.
>> >> >>
>> >> >> Doesn't work here, it just prints 'l' resp. 't'.
>> >> >
>> >> >I am using telnet for accessing my serial consoles exported by
>> >> >the multiplicator or KVM and it can send sysrq via ctrl+t (Send
>> >> >Break). Check your serial console setup.
>> >>
>> >> I'm using Raritan KVM and i created keyboard macro 'sysrq + l' resp.
>> >> 'sysrq + t'.
>> >> I'm also unable to use it on my local PC. Maybe it needs
>> >> to be enabled somehow?
>> >
>> >Probably yes. echo 1 > /proc/sys/kernel/sysrq should enable all sysrq
>> >commands. You can select also some of them (have a look at
>> >Documentation/sysrq.txt for more information)
>>
>> Now it happens again and i was just looking on the server's
>> htop. I'm sure that this time it was only one process (apache)
>> running under user account (not root). It was taking about 100% CPU
>> (about 100% of one core). I was able to kill it by hand inside htop
>> but everything was very slow, server load was immediately on
>> 500. I'm sure it must be related to that Johannes kernel patches
>> because i'm also using i/o throttling in cgroups via Block IO
>> controller so users are unable to create such a huge I/O. I will try
>> to take stacks of processes but i'm not able to identify the
>> problematic process so i will have to take them from *all* apache
>> processes while killing them.
>
>It would be fantastic if you could capture those stacks. sysrq+t
>captures ALL of them in one go and drops them into your syslog.
>
>/proc/<pid>/stack for individual tasks works too.


Is there anything unusual in this stack?

[<ffffffff810d1a5e>] dump_header+0x7e/0x1e0
[<ffffffff810d195f>] ? find_lock_task_mm+0x2f/0x70
[<ffffffff810d1f25>] oom_kill_process+0x85/0x2a0
[<ffffffff810d24a8>] mem_cgroup_out_of_memory+0xa8/0xf0
[<ffffffff8110fb76>] mem_cgroup_oom_synchronize+0x2e6/0x310
[<ffffffff8110efc0>] ? mem_cgroup_uncharge_page+0x40/0x40
[<ffffffff810d2703>] pagefault_out_of_memory+0x13/0x130
[<ffffffff81026f6e>] mm_fault_error+0x9e/0x150
[<ffffffff81027424>] do_page_fault+0x404/0x490
[<ffffffff810f952c>] ? do_mmap_pgoff+0x3dc/0x430
[<ffffffff815cb87f>] page_fault+0x1f/0x30

The problem happened again, but my script was unable to capture the stacks. I could see which processes were causing the problem (two this time), and I have their PIDs.
The stack above is from a different process in the same cgroup (the memcg OOM killer killed it and printed its stack to syslog).

azur
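
For reference, the per-task approach mentioned above (reading /proc/<pid>/stack for every matching process) could be sketched roughly as follows. This is only an illustration under stated assumptions: Python 3, root privileges, and a kernel that exposes /proc/<pid>/stack and /proc/<pid>/comm. The function name collect_stacks, the proc_root parameter, and the process name "apache2" are hypothetical and are not taken from the thread.

```python
import os


def collect_stacks(comm_name, proc_root="/proc"):
    """Return {pid: stack_text} for every task whose comm equals comm_name.

    comm_name is compared against /proc/<pid>/comm, which the kernel
    truncates to 15 characters. proc_root is a hypothetical parameter that
    exists only so the function can be pointed at a test directory.
    """
    stacks = {}
    for entry in os.listdir(proc_root):
        # Only numeric directory names are process IDs.
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "comm")) as f:
                if f.read().strip() != comm_name:
                    continue
            with open(os.path.join(base, "stack")) as f:
                stacks[int(entry)] = f.read()
        except OSError:
            # The task exited in the meantime, or we lack permission.
            continue
    return stacks
```

Run as root, something like collect_stacks("apache2") would return the kernel stack of each matching task, which can then be written to a file before the processes are killed. Tasks that exit between the directory listing and the read are skipped rather than treated as errors, since that race is expected on a heavily loaded machine.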