Re: [PATCH 3/9] bfq: Split shared queues on move between cgroups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 08-12-22 11:52:38, Yu Kuai wrote:
> Hi, Jan!
> 
> 在 2022/03/30 20:42, Jan Kara 写道:
> > When bfqq is shared by multiple processes it can happen that one of the
> > processes gets moved to a different cgroup (or just starts submitting IO
> > for different cgroup). In case that happens we need to split the merged
> > bfqq as otherwise we will have IO for multiple cgroups in one bfqq and
> > we will just account IO time to wrong entities etc.
> > 
> > Similarly if the bfqq is scheduled to merge with another bfqq but the
> > merge didn't happen yet, cancel the merge as it need not be valid
> > anymore.
> > 
> > CC: stable@xxxxxxxxxxxxxxx
> > Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
> > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > ---
> >   block/bfq-cgroup.c  | 36 +++++++++++++++++++++++++++++++++---
> >   block/bfq-iosched.c |  2 +-
> >   block/bfq-iosched.h |  1 +
> >   3 files changed, 35 insertions(+), 4 deletions(-)
> > 
> > diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
> > index 420eda2589c0..9352f3cc2377 100644
> > --- a/block/bfq-cgroup.c
> > +++ b/block/bfq-cgroup.c
> > @@ -743,9 +743,39 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
> >   	}
> >   	if (sync_bfqq) {
> > -		entity = &sync_bfqq->entity;
> > -		if (entity->sched_data != &bfqg->sched_data)
> > -			bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
> > +		if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
> > +			/* We are the only user of this bfqq, just move it */
> > +			if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
> > +				bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
> > +		} else {
> > +			struct bfq_queue *bfqq;
> > +
> > +			/*
> > +			 * The queue was merged to a different queue. Check
> > +			 * that the merge chain still belongs to the same
> > +			 * cgroup.
> > +			 */
> > +			for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
> > +				if (bfqq->entity.sched_data !=
> > +				    &bfqg->sched_data)
> > +					break;
> > +			if (bfqq) {
> > +				/*
> > +				 * Some queue changed cgroup so the merge is
> > +				 * not valid anymore. We cannot easily just
> > +				 * cancel the merge (by clearing new_bfqq) as
> > +				 * there may be other processes using this
> > +				 * queue and holding refs to all queues below
> > +				 * sync_bfqq->new_bfqq. Similarly if the merge
> > +				 * already happened, we need to detach from
> > +				 * bfqq now so that we cannot merge bio to a
> > +				 * request from the old cgroup.
> > +				 */
> > +				bfq_put_cooperator(sync_bfqq);
> > +				bfq_release_process_ref(bfqd, sync_bfqq);
> > +				bic_set_bfqq(bic, NULL, 1);
> > +			}
> > +		}
> >   	}
> Our test found a use-after-free while accessing bfqq->bic->bfqq[] ([1]),
> and I really suspect the above change.

OK, so bfqq points to bfq_io_cq that is already freed. Nasty.

> 1) init state, 2 thread, 2 bic, and 2 bfqq
> 
> bic1->bfqq = bfqq1
> bfqq1->bic = bic1
> bic2->bfqq = bfqq2
> bfqq2->bic = bic2
> 
> 2) bfqq1 and bfqq2 is merged
> bic1->bfqq = bfqq2
> bfqq1->bic = NULL
> bic2->bfqq = bfqq2
> bfqq2->bic = NULL
> 
> 3) bfqq1 issue a io, before such io reaches bfq, move t1 to a new
> cgroup, and issues a new io. If the new io reaches bfq first:

What do you mean by 'bfqq1 issues IO'? Do you mean t1?

> bic1->bfqq = NULL
> bfqq1->bic = NULL
> bic2->bfqq = bfqq2
> bfqq2->bic = NULL
> 
> 4) while handling the new io, a new bfqq is created
> bic1->bfqq = bfqq3
> bfqq3->bic = bic1
> bic2->bfqq = bfqq2
> bfqq2->bic = NULL
> 
> 5) the old io reaches bfq, corresponding bic is got from rq:
> bic1->bfqq = bfqq3
> bfqq3->bic = bic1
> bic2->bfqq = bfqq2
> bfqq2->bic = bic1

So if this state happens, it would be indeed a problem. But I don't see how
it could happen. bics are associated with the process. So t1 will always
use bic1, t2 will always use bic2. In bfq_init_rq() we get bfqq either from
bic (so it would return bfqq3 for bic1) or we allocate a new queue (that
would be some bfqq4). So I see no way how bfqq2 could start pointing to
bic1...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux