>Hello azur, > >On Mon, Sep 02, 2013 at 12:38:02PM +0200, azurIt wrote: >> >>Hi azur, >> >> >> >>here is the x86-only rollup of the series for 3.2. >> >> >> >>Thanks! >> >>Johannes >> >>--- >> > >> > >> >Johannes, >> > >> >unfortunately, one problem arises: I have (again) cgroup which cannot be deleted :( it's a user who had very high memory usage and was reaching his limit very often. Do you need any info which i can gather now? > >Did the OOM killer go off in this group? > >Was there a warning in the syslog ("Fixing unhandled memcg OOM >context")? > >If it happens again, could you check if there are tasks left in the >cgroup? And provide /proc/<pid>/stack of the hung task trying to >delete the cgroup? > >> Now i can definitely confirm that problem is NOT fixed :( it happened again but i don't have any data because i already disabled all debug output. > >Which debug output? > >Do you still have access to the syslog? > >It's possible that, as your system does not deadlock on the OOMing >cgroup anymore, you hit a separate bug... > >Thanks! My script has just detected (and killed) another freezed cgroup. I must say that i'm not 100% sure that cgroup was really freezed but it has 99% or more memory usage for at least 30 seconds (well, or it has 99% memory usage in both two cases the script was checking it). Here are stacks of processes inside it before they were killed: pid: 26490 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26503 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26517 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26518 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26519 stack: [<ffffffff815cb618>] retint_careful+0xd/0x1a [<ffffffffffffffff>] 0xffffffffffffffff pid: 26520 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26521 stack: [<ffffffff815cb618>] retint_careful+0xd/0x1a [<ffffffffffffffff>] 0xffffffffffffffff pid: 26522 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26523 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26524 stack: [<ffffffff81052671>] sys_sched_yield+0x41/0x70 [<ffffffff81148d91>] free_more_memory+0x21/0x60 [<ffffffff8114941d>] __getblk+0x14d/0x2c0 [<ffffffff8119888b>] ext3_getblk+0xeb/0x240 [<ffffffff811989f9>] ext3_bread+0x19/0x90 [<ffffffff8119cea3>] ext3_dx_find_entry+0x83/0x1e0 [<ffffffff8119d2e4>] ext3_find_entry+0x2e4/0x480 [<ffffffff8119dbcd>] ext3_lookup+0x4d/0x120 [<ffffffff811228f5>] d_alloc_and_lookup+0x45/0x90 [<ffffffff81125578>] __lookup_hash+0xa8/0xf0 [<ffffffff81127852>] do_last+0x312/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26526 stack: [<ffffffffffffffff>] 0xffffffffffffffff pid: 26531 stack: [<ffffffff81127842>] do_last+0x302/0xa60 [<ffffffff81128077>] path_openat+0xd7/0x470 [<ffffffff81128529>] do_filp_open+0x49/0xa0 [<ffffffff81114a16>] do_sys_open+0x106/0x240 [<ffffffff81114b90>] sys_open+0x20/0x30 [<ffffffff815cbce6>] system_call_fastpath+0x18/0x1d [<ffffffffffffffff>] 0xffffffffffffffff pid: 26533 stack: [<ffffffff815cb618>] retint_careful+0xd/0x1a [<ffffffffffffffff>] 0xffffffffffffffff pid: 26536 stack: [<ffffffff81080a45>] refrigerator+0x95/0x160 [<ffffffff8106ac2b>] get_signal_to_deliver+0x1cb/0x540 [<ffffffff8100188b>] do_signal+0x6b/0x750 [<ffffffff81001fc5>] do_notify_resume+0x55/0x80 [<ffffffff815cb662>] retint_signal+0x3d/0x7b [<ffffffffffffffff>] 0xffffffffffffffff pid: 26539 stack: [<ffffffff815cb618>] retint_careful+0xd/0x1a [<ffffffffffffffff>] 0xffffffffffffffff -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html