On Mon 24-06-13 16:13:45, Johannes Weiner wrote:
> Hi guys,
>
> On Sat, Jun 22, 2013 at 10:09:58PM +0200, azurIt wrote:
> > >> But i'm sure of one thing - when problem occurs, nothing is able to
> > >> access hard drives (every process which tries it is freezed until
> > >> problem is resolved or server is rebooted).
> > >
> > >I would be really interesting to see what those tasks are blocked on.
> >
> > I'm trying to get it, stay tuned :)
> >
> > Today i noticed one bug, not 100% sure it is related to 'your' patch
> > but i didn't seen this before. I noticed that i have lots of cgroups
> > which cannot be removed - if i do 'rmdir <cgroup_directory>', it
> > just hangs and never complete. Even more, it's not possible to
> > access the whole cgroup filesystem until i kill that rmdir
> > (anything, which tries it, just hangs). All unremoveable cgroups has
> > this in 'memory.oom_control':
> > oom_kill_disable 0
> > under_oom 1
>
> Somebody acquires the OOM wait reference to the memcg and marks it
> under oom but then does not call into mem_cgroup_oom_synchronize() to
> clean up. That's why under_oom is set and the rmdir waits for
> outstanding references.
>
> > And, yes, 'tasks' file is empty.
>
> It's not a kernel thread that does it because all kernel-context
> handle_mm_fault() are annotated properly, which means the task must be
> userspace and, since tasks is empty, have exited before synchronizing.

Yes, well spotted. I have missed that while reviewing your patch.
The follow up fix looks correct.

> Can you try with the following patch on top?
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 5db0490..9a0b152 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -846,17 +846,6 @@ static noinline int
>  mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>  	       unsigned long address, unsigned int fault)
>  {
> -	/*
> -	 * Pagefault was interrupted by SIGKILL. We have no reason to
> -	 * continue pagefault.
> -	 */
> -	if (fatal_signal_pending(current)) {
> -		if (!(fault & VM_FAULT_RETRY))
> -			up_read(&current->mm->mmap_sem);
> -		if (!(error_code & PF_USER))
> -			no_context(regs, error_code, address);
> -		return 1;
> -	}
>  	if (!(fault & VM_FAULT_ERROR))
>  		return 0;
>

-- 
Michal Hocko
SUSE Labs