On Mon 06-11-17 13:12:22, Michal Hocko wrote:
> On Mon 06-11-17 13:00:25, Peter Zijlstra wrote:
> > On Mon, Nov 06, 2017 at 11:43:54AM +0100, Michal Hocko wrote:
> > > > Yes the comment is very much accurate.
> > >
> > > Which suggests that print_vma_addr might be problematic, right?
> > > Shouldn't we do trylock on mmap_sem instead?
> >
> > Yes that's complete rubbish. trylock will get spurious failures to print
> > when the lock is contended.
>
> Yes, but I guess that it is acceptable to not print the state under
> that condition.

So what do you think about this? I think this is more robust than
playing tricks with the explicit preempt count checks and less tedious
than making the check conditional on the context. This is on top of
Linus tree and if accepted it should replace the patch discussed here.
---
From 0de6d57cbc54ee2686d1f1e4ffcc4ed490ded8aa Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxxx>
Date: Mon, 6 Nov 2017 14:31:20 +0100
Subject: [PATCH] mm: do not rely on preempt_count in print_vma_addr

The preempt count check in print_vma_addr was added by e8bff74afbdb
("x86: fix "BUG: sleeping function called from invalid context" in
print_vma_addr()"). It relied on the elevated preempt count from
preempt_conditional_sti because a preempt_count check doesn't work on
non-preemptive kernels by default. The code has evolved though, and
d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag
handling") has replaced preempt_conditional_sti with an explicit
preempt_disable, which is a noop on !PREEMPT, so the check in
print_vma_addr is broken.

Fix the issue by using trylock on mmap_sem rather than checking the
preempt count. The allocation we are relying on has to be GFP_NOWAIT
as well. There is a chance that we won't dump the vma state if the lock
is contended or memory is short, but this is an acceptable outcome and
much less fragile than the non-working preemption check or tricks
around it.

Fixes: d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag handling")
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
 mm/memory.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index a728bed16c20..1e308ac8ca0a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4457,17 +4457,15 @@ void print_vma_addr(char *prefix, unsigned long ip)
 	struct vm_area_struct *vma;
 
 	/*
-	 * Do not print if we are in atomic
-	 * contexts (in exception stacks, etc.):
+	 * we might be running from an atomic context so we cannot sleep
 	 */
-	if (preempt_count())
+	if (!down_read_trylock(&mm->mmap_sem))
 		return;
 
-	down_read(&mm->mmap_sem);
 	vma = find_vma(mm, ip);
 	if (vma && vma->vm_file) {
 		struct file *f = vma->vm_file;
-		char *buf = (char *)__get_free_page(GFP_KERNEL);
+		char *buf = (char *)__get_free_page(GFP_NOWAIT);
 		if (buf) {
 			char *p;
 
-- 
2.14.2

-- 
Michal Hocko
SUSE Labs