From: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx> commit 77677cdbc2aa4b5d5d839562793d3d126201d18d upstream. The GHES code calls memory_failure_queue() from IRQ context to queue work into workqueue and schedule it on the current CPU. Then the work is processed in memory_failure_work_func() by kworker and calls memory_failure(). When a page is already poisoned, commit a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") make memory_failure() call kill_accessing_process() that: - holds mmap locking of current->mm - does pagetable walk to find the error virtual address - and sends SIGBUS to the current process with error info. However, the mm of kworker is not valid, resulting in a null-pointer dereference. So check mm when killing the accessing process. [akpm@xxxxxxxxxxxxxxxxxxxx: remove unrelated whitespace alteration] Link: https://lkml.kernel.org/r/20220914064935.7851-1-xueshuai@xxxxxxxxxxxxxxxxx Fixes: a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx> Reviewed-by: Miaohe Lin <linmiaohe@xxxxxxxxxx> Acked-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx> Cc: Huang Ying <ying.huang@xxxxxxxxx> Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> Cc: Bixuan Cui <cuibixuan@xxxxxxxxxxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- mm/memory-failure.c | 3 +++ 1 file changed, 3 insertions(+) --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -697,6 +697,9 @@ static int kill_accessing_process(struct }; priv.tk.tsk = p; + if (!p->mm) + return -EFAULT; + mmap_read_lock(p->mm); ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops, (void *)&priv);