On Fri, Oct 25, 2024 at 10:22:08AM +0800, Shawn Wang wrote:
> When running stress-ng-vm-segv test, we found a null pointer dereference
> error in task_numa_work(). Here is the backtrace:
>
> [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
> ......
> [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se
> ......
> [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
> [323676.067115] pc : vma_migratable+0x1c/0xd0
> [323676.067122] lr : task_numa_work+0x1ec/0x4e0
> [323676.067127] sp : ffff8000ada73d20
> [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010
> [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000
> [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000
> [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8
> [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035
> [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8
> [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4
> [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001
> [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000
> [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000
> [323676.067152] Call trace:
> [323676.067153] vma_migratable+0x1c/0xd0
> [323676.067155] task_numa_work+0x1ec/0x4e0
> [323676.067157] task_work_run+0x78/0xd8
> [323676.067161] do_notify_resume+0x1ec/0x290
> [323676.067163] el0_svc+0x150/0x160
> [323676.067167] el0t_64_sync_handler+0xf8/0x128
> [323676.067170] el0t_64_sync+0x17c/0x180
> [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000)
> [323676.067177] SMP: stopping secondary CPUs
> [323676.070184] Starting crashdump kernel...
>
> stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error
> handling function of the system, which tries to cause a SIGSEGV error on
> return from unmapping the whole address space of the child process.
>
> Normally this program will not cause kernel crashes. But before the
> munmap system call returns to user mode, a potential task_numa_work()
> for numa balancing could be added and executed. In this scenario, since the
> child process has no vma after munmap, the vma_next() in task_numa_work()
> will return a null pointer even if the vma iterator restarts from 0.
>
> Recheck the vma pointer before dereferencing it in task_numa_work().
>
> Fixes: 214dbc428137 ("sched: convert to vma iterator")
> Cc: stable@xxxxxxxxxxxxxxx # v6.2+

Thanks
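
For anyone following along, here is a rough userspace sketch of the pattern the changelog describes. It is not the kernel code and not the patch itself: toy_vma, toy_iter, toy_iter_set(), toy_next() and toy_scan() are hypothetical stand-ins that only model "look up from a start address, restart from 0 if nothing is found, then recheck the pointer before using it".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mapped region; not the kernel's vm_area_struct. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical stand-in for a VMA iterator over a sorted array. */
struct toy_iter {
	const struct toy_vma *vmas;	/* sorted by start address */
	size_t count;			/* 0 models "everything was unmapped" */
	size_t pos;			/* index of the next entry to return */
};

/* Position the iterator at the first region that ends above addr. */
static void toy_iter_set(struct toy_iter *it, unsigned long addr)
{
	assert(it != NULL);
	it->pos = 0;
	while (it->pos < it->count && it->vmas[it->pos].end <= addr)
		it->pos++;
}

/* Return the next region, or NULL when none remain. */
static const struct toy_vma *toy_next(struct toy_iter *it)
{
	assert(it != NULL);
	if (it->pos >= it->count)
		return NULL;
	return &it->vmas[it->pos++];
}

/* Scan from 'start' and return how many regions were visited. */
static int toy_scan(struct toy_iter *it, unsigned long start)
{
	const struct toy_vma *vma;
	int visited = 0;

	toy_iter_set(it, start);
	vma = toy_next(it);
	if (!vma) {
		/* Nothing at or above 'start': restart from address 0. */
		toy_iter_set(it, 0);
		vma = toy_next(it);
	}

	/*
	 * The restart can still yield NULL when no regions exist at all,
	 * so the pointer is rechecked here before any field is read.
	 */
	for (; vma; vma = toy_next(it))
		visited++;

	return visited;
}
```

With an empty iterator (count == 0), toy_scan() returns 0 without touching a region. Without the recheck after the restart, the first field access would go through a NULL pointer, which is the situation in the backtrace above.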