Re: [PATCH 2/2] blk-cgroup: fix list corruption from reorder of WRITE ->lqueued

On Wed, May 15, 2024 at 10:09:30AM -0400, Waiman Long wrote:
> 
> On 5/14/24 21:31, Ming Lei wrote:
> > __blkcg_rstat_flush() can run at any time, in particular concurrently
> > with blk_cgroup_bio_start().
> > 
> > If the WRITE to `->lqueued` is reordered with the READ of
> > `bisc->lnode.next` in the loop of __blkcg_rstat_flush(), `next_bisc`
> > can pick up a stat instance that is concurrently being added in
> > blk_cgroup_bio_start(), corrupting the local list in
> > __blkcg_rstat_flush().
> > 
> > Fix the issue by adding a barrier.
> > 
> > Cc: Tejun Heo <tj@xxxxxxxxxx>
> > Cc: Waiman Long <longman@xxxxxxxxxx>
> > Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> >   block/blk-cgroup.c | 10 ++++++++++
> >   1 file changed, 10 insertions(+)
> > 
> > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> > index 86752b1652b5..b36ba1d40ba1 100644
> > --- a/block/blk-cgroup.c
> > +++ b/block/blk-cgroup.c
> > @@ -1036,6 +1036,16 @@ static void __blkcg_rstat_flush(struct blkcg *blkcg, int cpu)
> >   		struct blkg_iostat cur;
> >   		unsigned int seq;
> > +		/*
> > +		 * Order the assignment of `next_bisc` from `bisc->lnode.next`
> > +		 * in llist_for_each_entry_safe() against the clearing of
> > +		 * `bisc->lqueued`, so that `next_bisc` cannot pick up a new
> > +		 * next pointer added in blk_cgroup_bio_start() on reordering.
> > +		 *
> > +		 * The pairing barrier is implied by llist_add() in blk_cgroup_bio_start().
> > +		 */
> > +		smp_mb();
> > +
> >   		WRITE_ONCE(bisc->lqueued, false);
> 
> I believe replacing WRITE_ONCE() with smp_store_release() and the READ_ONCE()
> in blk_cgroup_bio_start() with smp_load_acquire() should provide a sufficient
> barrier to prevent unexpected reordering, as

Yeah, an smp_load_acquire()/smp_store_release() pair would work too, but
the acquire adds extra cost in blk_cgroup_bio_start(), where llist_add()
already implies a full barrier.

So I prefer this patch.
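
For reference, a minimal sketch of the two approaches side by side; this
paraphrases and simplifies the code in blk_cgroup_bio_start() and
__blkcg_rstat_flush() (the names bis, bisc and lhead follow the quoted
patch), it is not the actual block/blk-cgroup.c source:

	/* This patch: full barrier on the flush side only. */
	llist_for_each_entry_safe(bisc, next_bisc, lnode, lnode) {
		/* order READ of bisc->lnode.next vs. WRITE of lqueued */
		smp_mb();
		WRITE_ONCE(bisc->lqueued, false);
		/* ... flush bisc ... */
	}

	/* Producer side stays as-is: llist_add() already implies a full
	 * barrier, so a plain READ_ONCE() check is enough. */
	if (!READ_ONCE(bis->lqueued)) {
		llist_add(&bis->lnode, lhead);
		WRITE_ONCE(bis->lqueued, true);
	}

	/* Alternative: a release/acquire pair is correct too, but it puts
	 * an acquire on the hot producer path. */
	smp_store_release(&bisc->lqueued, false);	/* flush side */
	if (!smp_load_acquire(&bis->lqueued)) {		/* producer side */
		llist_add(&bis->lnode, lhead);
		WRITE_ONCE(bis->lqueued, true);
	}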

> the subsequent
> u64_stats_fetch_begin() will also provide a read barrier for the
> subsequent reads.

u64_stats_fetch_begin() is a nop when `BITS_PER_LONG == 64`, so it does
not provide any barrier there.
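
For context, this is roughly what the header does (paraphrased from
include/linux/u64_stats_sync.h; details vary across kernel versions):

	/*
	 * On 64-bit there is no seqcount at all, so fetch_begin()
	 * compiles to nothing and implies no barrier.
	 */
	#if BITS_PER_LONG == 64
	static inline unsigned int
	__u64_stats_fetch_begin(const struct u64_stats_sync *syncp)
	{
		return 0;
	}
	#else
	/* 32-bit: a real seqcount read section, which does imply a barrier. */
	static inline unsigned int
	__u64_stats_fetch_begin(const struct u64_stats_sync *syncp)
	{
		return read_seqcount_begin(&syncp->seq);
	}
	#endif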


Thanks,
Ming