[PATCH] xfs: fix race condition in inodegc list and cpumask handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



There is a race condition between inodegc queue and inodegc worker where
the cpumask bit may not be set when concurrent operations occur.

Current problematic sequence:

  CPU0                             CPU1
  --------------------             ---------------------
  xfs_inodegc_queue()              xfs_inodegc_worker()
                                     llist_del_all(&gc->list)
    llist_add(&ip->i_gclist, &gc->list)
    cpumask_test_and_set_cpu()
                                     cpumask_clear_cpu()
                  < cpumask not set >

Fix this by moving llist_del_all() after cpumask_clear_cpu() to ensure
proper ordering. This change ensures that when the worker thread clears
the cpumask, any concurrent queue operations will either properly set
the cpumask bit or have already emptied the list.

Also remove unnecessary smp_mb__{before/after}_atomic() barriers since
the llist_* operations already provide required ordering semantics. it
make the code cleaner.

Fixes: 62334fab4762 ("xfs: use per-mount cpumask to track nonempty percpu inodegc lists")
Reported-by: Hou Tao <houtao1@xxxxxxxxxx>
Signed-off-by: Long Li <leo.lilong@xxxxxxxxxx>
---
 fs/xfs/xfs_icache.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 7b6c026d01a1..fba784d7a146 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1948,19 +1948,20 @@ xfs_inodegc_worker(
 {
 	struct xfs_inodegc	*gc = container_of(to_delayed_work(work),
 						struct xfs_inodegc, work);
-	struct llist_node	*node = llist_del_all(&gc->list);
+	struct llist_node	*node;
 	struct xfs_inode	*ip, *n;
 	struct xfs_mount	*mp = gc->mp;
 	unsigned int		nofs_flag;
 
+	cpumask_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask);
+
 	/*
-	 * Clear the cpu mask bit and ensure that we have seen the latest
-	 * update of the gc structure associated with this CPU. This matches
-	 * with the release semantics used when setting the cpumask bit in
-	 * xfs_inodegc_queue.
+	 * llist_del_all provides ordering semantics that ensure the CPU mask
+	 * clearing is always visible before removing all entries from the gc
+	 * list. This prevents list being added while the CPU mask bit is
+	 * unset in xfs_inodegc_queue()
 	 */
-	cpumask_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask);
-	smp_mb__after_atomic();
+	node = llist_del_all(&gc->list);
 
 	WRITE_ONCE(gc->items, 0);
 
@@ -2188,13 +2189,11 @@ xfs_inodegc_queue(
 	shrinker_hits = READ_ONCE(gc->shrinker_hits);
 
 	/*
-	 * Ensure the list add is always seen by anyone who finds the cpumask
-	 * bit set. This effectively gives the cpumask bit set operation
-	 * release ordering semantics.
+	 * llist_add provides necessary ordering semantics to ensure list
+	 * additions are visible when cpumask bit is found set, so no
+	 * additional memory barrier is needed.
 	 */
-	smp_mb__before_atomic();
-	if (!cpumask_test_cpu(cpu_nr, &mp->m_inodegc_cpumask))
-		cpumask_test_and_set_cpu(cpu_nr, &mp->m_inodegc_cpumask);
+	cpumask_test_and_set_cpu(cpu_nr, &mp->m_inodegc_cpumask);
 
 	/*
 	 * We queue the work while holding the current CPU so that the work
-- 
2.39.2





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux