On Sun 19-06-16 12:06:53, Tetsuo Handa wrote: > On 2016/06/16 18:39, Michal Hocko wrote: > > On Wed 15-06-16 12:50:43, Sasha Levin wrote: > >> Hi all, > >> > >> I'm seeing the following NULL ptr deref in copy_process right after a bunch > >> of OOM killing activity on -next kernels: > >> > >> Out of memory (oom_kill_allocating_task): Kill process 3477 (trinity-c159) score 0 or sacrifice child > >> Killed process 3477 (trinity-c159) total-vm:3226820kB, anon-rss:36832kB, file-rss:1640kB, shmem-rss:444kB > >> oom_reaper: reaped process 3477 (trinity-c159), now anon-rss:0kB, file-rss:0kB, shmem-rss:444kB > >> Out of memory (oom_kill_allocating_task): Kill process 3450 (trinity-c156) score 0 or sacrifice child > >> Killed process 3450 (trinity-c156) total-vm:3769768kB, anon-rss:36832kB, file-rss:1652kB, shmem-rss:508kB > >> oom_reaper: reaped process 3450 (trinity-c156), now anon-rss:0kB, file-rss:0kB, shmem-rss:572kB > >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000150 > >> IP: copy_process (./arch/x86/include/asm/atomic.h:103 kernel/fork.c:484 kernel/fork.c:964 kernel/fork.c:1018 kernel/fork.c:1484) > >> PGD 1ff944067 PUD 1ff929067 PMD 0 > >> Oops: 0002 [#1] PREEMPT SMP KASAN > >> Modules linked in: > >> CPU: 18 PID: 8761 Comm: trinity-main Not tainted 4.7.0-rc3-sasha-02101-g1e1b9fa #3108 > > > > Is this a common parent of the oom killed children? > > > >> task: ffff880165564000 ti: ffff880337ad0000 task.ti: ffff880337ad0000 > >> RIP: copy_process (./arch/x86/include/asm/atomic.h:103 kernel/fork.c:484 kernel/fork.c:964 kernel/fork.c:1018 kernel/fork.c:1484) > > > > IIUC this should be: > > _do_fork > > copy_process > > copy_mm > > dup_mm > > dup_mmap > > if (tmp->vm_flags & VM_DENYWRITE) > > atomic_dec(&inode->i_writecount); > > > > I am not really sure how f->f_inode can become NULL when file should pin > > the inode AFAIR, and VMA should pin the file. Anyway this shouldn't be > > directly related to the OOM killer or at least the recent changes > > in that area because the oom reaper doesn't touch VMAs file. > > These OOM messages say that oom_kill_allocating_task != 0 is used. > That is, a __GFP_FS allocation by a child process which is trying to > duplicate the parent's mm_struct was killed by the OOM killer and > reaped by the OOM reaper. I guess that mmap related stuff are not > fully initialized (or consistent) yet while the OOM reaper assumed > that it is safe to access such child's mmap related stuff. I will double check but the oom_reaper only unmaps VMAs. We are not deleting or modifying the VMA layout or disassociate VMAs from their files. So I do not see how this could be related. > So, if this bug is reproducible (I thing it is), first try to reproduce > this bug without the OOM reaper enabled (i.e. comment out the Yes, that would be definitely good to test. > > subsys_initcall(oom_init) > > line in mm/oom_kill.c ). -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>