On Wed, May 15, 2024 at 10:09:30AM -0400, Waiman Long wrote: > > On 5/14/24 21:31, Ming Lei wrote: > > __blkcg_rstat_flush() can be run anytime, especially when blk_cgroup_bio_start > > is being executed. > > > > If WRITE of `->lqueued` is re-ordered with READ of 'bisc->lnode.next' in > > the loop of __blkcg_rstat_flush(), `next_bisc` can be assigned with one > > stat instance being added in blk_cgroup_bio_start(), then the local > > list in __blkcg_rstat_flush() could be corrupted. > > > > Fix the issue by adding one barrier. > > > > Cc: Tejun Heo <tj@xxxxxxxxxx> > > Cc: Waiman Long <longman@xxxxxxxxxx> > > Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()") > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > --- > > block/blk-cgroup.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > > index 86752b1652b5..b36ba1d40ba1 100644 > > --- a/block/blk-cgroup.c > > +++ b/block/blk-cgroup.c > > @@ -1036,6 +1036,16 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu) > > struct blkg_iostat cur; > > unsigned int seq; > > + /* > > + * Order assignment of `next_bisc` from `bisc->lnode.next` in > > + * llist_for_each_entry_safe and clearing `bisc->lqueued` for > > + * avoiding to assign `next_bisc` with new next pointer added > > + * in blk_cgroup_bio_start() in case of re-ordering. > > + * > > + * The pair barrier is implied in llist_add() in blk_cgroup_bio_start(). > > + */ > > + smp_mb(); > > + > > WRITE_ONCE(bisc->lqueued, false); > > I believe replacing WRITE_ONCE() by smp_store_release() and the READ_ONCE() > in blk_cgroup_bio_start() by smp_load_acquire() should provide enough > barrier to prevent unexpected reordering as Yeah, smp_load_acquire() and smp_store_release() pair works too, but with one extra cost of smp_mb() around llist_add() which implies barrier already. So I prefer to this patch. > the subsequent > u64_stats_fetch_begin() will also provide a read barrier for subsequent > read. u64_stats_fetch_begin() is nop in case of `BITS_PER_LONG == 64`. Thanks, Ming