Re: [PATCH 1/2] mm: fix null-ptr-deref in kswapd_is_running()

On 24.08.22 09:19, Kefeng Wang wrote:
> kswapd_run() and kswapd_stop() can set pgdat->kswapd to NULL, which
> can race with kswapd_is_running() in kcompactd():
> 
> kswapd_run/stop()	kcompactd()
> 			  kswapd_is_running()
> 				if (pgdat->kswapd) // load non-NULL pgdat->kswapd
>   pgdat->kswapd = NULL
> 				task_is_running(pgdat->kswapd) // NULL pointer dereference
> 
> KASAN reports the null-ptr-deref shown below:
> 
>   vmscan: Failed to start kswapd on node 0
>   ...
>   BUG: KASAN: null-ptr-deref in kcompactd+0x440/0x504
>   Read of size 8 at addr 0000000000000024 by task kcompactd0/37
> 
>   CPU: 0 PID: 37 Comm: kcompactd0 Kdump: loaded Tainted: G           OE     5.10.60 #1
>   Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>   Call trace:
>    dump_backtrace+0x0/0x394
>    show_stack+0x34/0x4c
>    dump_stack+0x158/0x1e4
>    __kasan_report+0x138/0x140
>    kasan_report+0x44/0xdc
>    __asan_load8+0x94/0xd0
>    kcompactd+0x440/0x504
>    kthread+0x1a4/0x1f0
>    ret_from_fork+0x10/0x18
> 
> To fix the race between kswapd_run() and kcompactd(), store the result
> of kthread_run() in a temporary variable and assign it to pgdat->kswapd
> only if kthread_run() returned a valid task_struct.
> 
> To fix the race between kswapd_stop() and kcompactd(), call
> kcompactd_stop() before kswapd_stop().
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> ---
>  mm/memory_hotplug.c | 2 +-
>  mm/vmscan.c         | 8 +++++---
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index fad6d1f2262a..2fd45ccbce45 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1940,8 +1940,8 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  
>  	node_states_clear_node(node, &arg);
>  	if (arg.status_change_nid >= 0) {
> -		kswapd_stop(node);
>  		kcompactd_stop(node);
> +		kswapd_stop(node);
>  	}

This looks fragile: it can easily break again in the future when people
work on this code without being aware of this condition, or once there
are other (future?) kswapd_is_running() users. We at least need a
comment explaining that the order here matters, and why.

But I do wonder if we can't handle it in a cleaner, more obvious way.

kswapd_run()/kswapd_stop() should have a proper way to synchronize
with kswapd_is_running(). It's just a matter of finding a suitable
locking primitive :)
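
To illustrate what such synchronization could look like, here is a minimal
userspace sketch, not kernel code. It assumes a per-node mutex is an
acceptable primitive, which is exactly the open question. All names
(struct pgdat_model, struct task_model, kswapd_lock and the *_model()
helpers) are hypothetical stand-ins and not existing kernel APIs. The
writers and the reader take the same lock, so the reader never
dereferences a pointer that was cleared between its check and its use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct task_struct. */
struct task_model {
	bool running;
};

/* Hypothetical userspace stand-in for pg_data_t. */
struct pgdat_model {
	pthread_mutex_t kswapd_lock;	/* assumed lock protecting ->kswapd */
	struct task_model *kswapd;	/* NULL while no kswapd exists */
};

static void pgdat_model_init(struct pgdat_model *pgdat)
{
	assert(pgdat != NULL);
	pthread_mutex_init(&pgdat->kswapd_lock, NULL);
	pgdat->kswapd = NULL;
}

/*
 * Models kswapd_run(): 't' plays the role of the kthread_run() result
 * held in a temporary. It is published only if creation succeeded
 * (t != NULL) and no kswapd is registered yet.
 */
static bool kswapd_run_model(struct pgdat_model *pgdat, struct task_model *t)
{
	bool started = false;

	pthread_mutex_lock(&pgdat->kswapd_lock);
	if (!pgdat->kswapd && t) {
		pgdat->kswapd = t;
		started = true;
	}
	pthread_mutex_unlock(&pgdat->kswapd_lock);
	return started;
}

/* Models kswapd_stop(): clear the pointer under the lock. */
static void kswapd_stop_model(struct pgdat_model *pgdat)
{
	pthread_mutex_lock(&pgdat->kswapd_lock);
	pgdat->kswapd = NULL;
	pthread_mutex_unlock(&pgdat->kswapd_lock);
}

/* Models kswapd_is_running(): check and dereference under the same lock. */
static bool kswapd_is_running_model(struct pgdat_model *pgdat)
{
	bool running;

	pthread_mutex_lock(&pgdat->kswapd_lock);
	running = pgdat->kswapd && pgdat->kswapd->running;
	pthread_mutex_unlock(&pgdat->kswapd_lock);
	return running;
}
```

With something along these lines, the order of kcompactd_stop() and
kswapd_stop() in offline_pages() would no longer matter for correctness.
Whether a sleeping lock is acceptable in the kcompactd() path would still
need to be checked.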

-- 
Thanks,

David / dhildenb




