On 5/14/24 21:31, Ming Lei wrote:
__blkcg_rstat_flush() can run at any time, in particular while
blk_cgroup_bio_start() is being executed.
If the WRITE to `->lqueued` is reordered before the READ of `bisc->lnode.next`
in the loop of __blkcg_rstat_flush(), `next_bisc` can be assigned a stat
instance that is concurrently being re-added in blk_cgroup_bio_start(), and
the local list built in __blkcg_rstat_flush() can then be corrupted.
Fix the issue by adding a barrier.
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
block/blk-cgroup.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 86752b1652b5..b36ba1d40ba1 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1036,6 +1036,16 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu)
struct blkg_iostat cur;
unsigned int seq;
+ /*
+ * Order the reading of `bisc->lnode.next` (the assignment of
+ * `next_bisc` in llist_for_each_entry_safe) before the clearing of
+ * `bisc->lqueued`, so that `next_bisc` cannot pick up a new next
+ * pointer added by blk_cgroup_bio_start() after the entry has been
+ * marked as dequeued.
+ *
+ * The paired barrier is implied by llist_add() in blk_cgroup_bio_start().
+ */
+ smp_mb();
+
WRITE_ONCE(bisc->lqueued, false);
I believe replacing the WRITE_ONCE() with smp_store_release() and the
READ_ONCE() in blk_cgroup_bio_start() with smp_load_acquire() should
provide enough of a barrier to prevent the unexpected reordering, since
the subsequent u64_stats_fetch_begin() also provides a read barrier for
the subsequent reads.
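
Roughly, the alternative I have in mind would look like the sketch below.
This is untested and the surrounding code is paraphrased from
block/blk-cgroup.c rather than quoted exactly; details such as the blkg
reference handling are omitted:

	/*
	 * In __blkcg_rstat_flush(): the release store orders the earlier
	 * read of bisc->lnode.next (done when llist_for_each_entry_safe()
	 * computes next_bisc) before the clearing of lqueued, so a
	 * concurrent blk_cgroup_bio_start() cannot make us pick up a new
	 * next pointer.
	 */
	smp_store_release(&bisc->lqueued, false);

	/*
	 * In blk_cgroup_bio_start(): the acquire load pairs with the
	 * release above, so once lqueued is observed as false the flush
	 * side is done with lnode.next and the node can safely be re-added
	 * to the per-cpu llist.
	 */
	if (!smp_load_acquire(&bis->lqueued)) {
		struct llist_head *lhead = this_cpu_ptr(blkcg->lhead);

		llist_add(&bis->lnode, lhead);
		WRITE_ONCE(bis->lqueued, true);
	}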
Cheers,
Longman