Re: [PATCH] mm/vmscan: fix data races at kswapd_classzone_idx

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Tue, 25 Feb 2020 18:11:01 -0800

On Tue, 25 Feb 2020 11:55:26 -0500 Qian Cai <cai@xxxxxx> wrote:

> pgdat->kswapd_classzone_idx could be accessed concurrently in
> wakeup_kswapd(). Plain writes and reads without any lock protection
> result in data races. Fix them by adding a pair of READ|WRITE_ONCE() as
> well as saving a branch (compilers might well optimize the original code
> in an unintentional way anyway). The data races were reported by KCSAN,
> 
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3961,11 +3961,10 @@ void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
>  		return;
>  	pgdat = zone->zone_pgdat;
>  
> -	if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
> -		pgdat->kswapd_classzone_idx = classzone_idx;
> -	else
> -		pgdat->kswapd_classzone_idx = max(pgdat->kswapd_classzone_idx,
> -						  classzone_idx);
> +	if (READ_ONCE(pgdat->kswapd_classzone_idx) == MAX_NR_ZONES ||
> +	    READ_ONCE(pgdat->kswapd_classzone_idx) < classzone_idx)
> +		WRITE_ONCE(pgdat->kswapd_classzone_idx, classzone_idx);
> +
>  	pgdat->kswapd_order = max(pgdat->kswapd_order, order);
>  	if (!waitqueue_active(&pgdat->kswapd_wait))
>  		return;

This is very partial, isn't it?  The above code itself is racy against
other code which manipulates ->kswapd_classzone_idx and the
manipulation in allow_direct_reclaim() is performed by threads other
than kswapd and so need the READ_ONCE treatment and is still racy with
that?

I guess occasional races here don't really matter, but a grossly wrong
read from load tearing might matter.  In which case shouldn't we be
defending against them in all cases where non-kswapd threads read this
field?