On 5/15/24 10:09, Waiman Long wrote:
On 5/14/24 21:31, Ming Lei wrote:
__blkcg_rstat_flush() can run at any time, in particular while
blk_cgroup_bio_start() is executing.
If the WRITE of `->lqueued` is reordered before the READ of `bisc->lnode.next`
in the loop of __blkcg_rstat_flush(), `next_bisc` can end up pointing at a stat
instance that blk_cgroup_bio_start() is concurrently adding, and the local
list in __blkcg_rstat_flush() could be corrupted.
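Roughly, the problematic interleaving looks like this (reconstructed from the
description above, so the exact code lines may differ):

  CPU0 (__blkcg_rstat_flush)             CPU1 (blk_cgroup_bio_start)
  --------------------------             ---------------------------
  WRITE_ONCE(bisc->lqueued, false)
    (reordered before the read of
     bisc->lnode.next)
                                         READ_ONCE(bis->lqueued) sees false
                                         llist_add(&bis->lnode, lhead)
                                           -> overwrites bisc->lnode.next
  next_bisc = bisc->lnode.next
    -> now points into the live per-cpu
       llist instead of the local list
       being flushed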
Fix the issue by adding a barrier.
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
block/blk-cgroup.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 86752b1652b5..b36ba1d40ba1 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1036,6 +1036,16 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu)
struct blkg_iostat cur;
unsigned int seq;
+ /*
+ * Order the load of `bisc->lnode.next` into `next_bisc` in
+ * llist_for_each_entry_safe() before the clearing of
+ * `bisc->lqueued`, so that `next_bisc` cannot pick up a new
+ * next pointer added by blk_cgroup_bio_start() if the two
+ * accesses are reordered.
+ *
+ * The paired barrier is implied by llist_add() in
+ * blk_cgroup_bio_start().
+ */
+ smp_mb();
+
WRITE_ONCE(bisc->lqueued, false);

I believe replacing the WRITE_ONCE() with smp_store_release() and the
READ_ONCE() in blk_cgroup_bio_start() with smp_load_acquire() should provide
a sufficient barrier to prevent the unexpected reordering, as the subsequent
u64_stats_fetch_begin() will also provide a read barrier for the subsequent
reads.
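For reference, the pairing would look roughly like this (just a sketch against
the two functions mentioned above, going from memory of the current code, not
a tested patch):

  /* __blkcg_rstat_flush(): order the read of bisc->lnode.next in the
   * loop before marking the entry as dequeued.
   */
  smp_store_release(&bisc->lqueued, false);

  /* blk_cgroup_bio_start(): pair with the release above before the
   * entry can be re-added to the per-cpu llist.
   */
  if (!smp_load_acquire(&bis->lqueued)) {
          llist_add(&bis->lnode, this_cpu_ptr(blkcg->lhead));
          WRITE_ONCE(bis->lqueued, true);
  }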
We could also use smp_acquire__after_ctrl_dep() after the READ_ONCE() in
blk_cgroup_bio_start(), instead of a full smp_load_acquire(), to optimize it
a bit more.
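Something along these lines (again only a sketch, assuming the current shape
of blk_cgroup_bio_start()):

  if (!READ_ONCE(bis->lqueued)) {
          /* Upgrade the control dependency on lqueued to acquire
           * semantics before touching the llist.
           */
          smp_acquire__after_ctrl_dep();
          llist_add(&bis->lnode, this_cpu_ptr(blkcg->lhead));
          WRITE_ONCE(bis->lqueued, true);
  }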
Cheers,
Longman