The PGDAT_RECLAIM_LOCKED bit is used to provide mutual exclusion of node reclaim for struct pglist_data using a single bit. It is "locked" with a test_and_set_bit (similarly to a try lock) which provides full ordering with respect to loads and stores done within __node_reclaim(). It is "unlocked" with clear_bit(), which does not provide any ordering with respect to loads and stores done before clearing the bit. The lack of clear_bit() memory ordering with respect to stores within __node_reclaim() can cause a subsequent CPU to fail to observe stores from a prior node reclaim. This is not an issue in practice on TSO (e.g. x86), but it is an issue on weakly-ordered architectures (e.g. arm64). Fix this with following changes: A) Use clear_bit_unlock rather than clear_bit to clear PGDAT_RECLAIM_LOCKED with a release memory ordering semantic. This provides stronger memory ordering (release rather than relaxed). B) Use test_and_set_bit_lock rather than test_and_set_bit to test-and-set PGDAT_RECLAIM_LOCKED with an acquire memory ordering semantic. This changes the "lock" acquisition from a full barrier to an acquire memory ordering, which is weaker. The acquire semi-permeable barrier paired with the release on unlock is sufficient for this mutual exclusion use-case. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> Cc: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> Cc: Andrea Parri <parri.andrea@xxxxxxxxx> Cc: Will Deacon <will@xxxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Cc: Boqun Feng <boqun.feng@xxxxxxxxx> Cc: Nicholas Piggin <npiggin@xxxxxxxxx> Cc: David Howells <dhowells@xxxxxxxxxx> Cc: Jade Alglave <j.alglave@xxxxxxxxx> Cc: Luc Maranget <luc.maranget@xxxxxxxx> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx> Cc: linux-mm@xxxxxxxxx --- mm/vmscan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index c22175120f5d..021b25bdba91 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -7567,11 +7567,11 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order) if (node_state(pgdat->node_id, N_CPU) && pgdat->node_id != numa_node_id()) return NODE_RECLAIM_NOSCAN; - if (test_and_set_bit(PGDAT_RECLAIM_LOCKED, &pgdat->flags)) + if (test_and_set_bit_lock(PGDAT_RECLAIM_LOCKED, &pgdat->flags)) return NODE_RECLAIM_NOSCAN; ret = __node_reclaim(pgdat, gfp_mask, order); - clear_bit(PGDAT_RECLAIM_LOCKED, &pgdat->flags); + clear_bit_unlock(PGDAT_RECLAIM_LOCKED, &pgdat->flags); if (ret) count_vm_event(PGSCAN_ZONE_RECLAIM_SUCCESS); -- 2.25.1