Re: [PATCH 2/2] blk-cgroup: fix list corruption from reorder of WRITE ->lqueued

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/15/24 10:09, Waiman Long wrote:

On 5/14/24 21:31, Ming Lei wrote:
__blkcg_rstat_flush() can be run anytime, especially when blk_cgroup_bio_start
is being executed.

If WRITE of `->lqueued` is re-ordered with READ of 'bisc->lnode.next' in
the loop of __blkcg_rstat_flush(), `next_bisc` can be assigned with one
stat instance being added in blk_cgroup_bio_start(), then the local
list in __blkcg_rstat_flush() could be corrupted.

Fix the issue by adding one barrier.

Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
  block/blk-cgroup.c | 10 ++++++++++
  1 file changed, 10 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 86752b1652b5..b36ba1d40ba1 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1036,6 +1036,16 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu)
          struct blkg_iostat cur;
          unsigned int seq;
  +        /*
+         * Order assignment of `next_bisc` from `bisc->lnode.next` in
+         * llist_for_each_entry_safe and clearing `bisc->lqueued` for
+         * avoiding to assign `next_bisc` with new next pointer added
+         * in blk_cgroup_bio_start() in case of re-ordering.
+         *
+         * The pair barrier is implied in llist_add() in blk_cgroup_bio_start().
+         */
+        smp_mb();
+
          WRITE_ONCE(bisc->lqueued, false);

I believe replacing WRITE_ONCE() by smp_store_release() and the READ_ONCE() in blk_cgroup_bio_start() by smp_load_acquire() should provide enough barrier to prevent unexpected reordering as the subsequent u64_stats_fetch_begin() will also provide a read barrier for subsequent read.

We can also use a smp_acquire__after_ctrl_dep() after the READ_ONCE() in blk_cgroup_bio_start() instead of a full smp_load_acquire() to optimize it a bit more.

Cheers,
Longman





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux