On Mon 17-01-22 22:16:23, yukuai (C) wrote: > 在 2022/01/17 19:45, Jan Kara 写道: > > On Mon 17-01-22 16:13:09, yukuai (C) wrote: > > > 在 2022/01/15 1:01, Jan Kara 写道: > > > > Hello, > > > > > > > > here is the third version of my patches to fix use-after-free issues in BFQ > > > > when processes with merged queues get moved to different cgroups. The patches > > > > have survived some beating in my test VM, but so far I fail to reproduce the > > > > original KASAN reports so testing from people who can reproduce them is most > > > > welcome. Kuai, can you please give these patches a run in your setup? Thanks > > > > a lot for your help with fixing this! > > > > > > > > Changes since v3: > > > > * Changed handling of bfq group move to handle the case when target of the > > > > merge has moved. > > > Hi, Jan > > > > > > The problem can still be reporduced... > > > > Drat. Thanks for testing. > > > > > Do you implement this in patch 4? If so, I don't understand how you > > > chieve this. > > > > Yes, patch 4 should be handling this. > > > > > For example: 3 bfqqs were merged: > > > q1->q2->q3 > > > > > > If __bfq_bic_change_cgroup() is called with q2, the problem can be > > > fixed. However, if __bfq_bic_change_cgroup() is called with q3, can > > > the problem be fixed? > > > > Maybe I'm missing something so let's walk through your example where the > > bio is submitted for a task attached to q3. Bio is submitted, > > __bfq_bic_change_cgroup() is called with sync_bfqq == q3, the loop > > > > while (sync_bfqq->new_bfqq) > > sync_bfqq = sync_bfqq->new_bfqq; > > > > won't do anything. Then we check: > > > > if (sync_bfqq->entity.sched_data != &bfqg->sched_data) { > > > > This should be true because the task got moved and the bio is thus now > > submitted for a different cgroup. Then we get to the condition: > > > > if (orig_bfqq != sync_bfqq || bfq_bfqq_coop(sync_bfqq)) > > > > orig_bfqq == sync_bfqq so that won't help but the idea was that > > bfq_bfqq_coop() would trigger in this case so we detach the process from q3 > > (through bic_set_bfqq(bic, NULL, 1)) and create new queue in the right > > cgroup. Eventually, all the processes would detach from q3 and it would get > > destroyed. So why do you think this scheme will not work? > > Hi, Jan > > I repoduced again with some debug info: Thanks for your help with debugging! > diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c > index f6f5f156b9f2..f62ebc4bbe56 100644 > --- a/block/bfq-cgroup.c > +++ b/block/bfq-cgroup.c > @@ -732,8 +732,11 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct > bfq_data *bfqd, > struct bfq_queue *orig_bfqq = sync_bfqq; > > /* Traverse the merge chain to bfqq we will be using */ > - while (sync_bfqq->new_bfqq) > + while (sync_bfqq->new_bfqq) { > + printk("%s: bfqq %px, new_bfqq %px\n", __func__, > + sync_bfqq, sync_bfqq->new_bfqq); > sync_bfqq = sync_bfqq->new_bfqq; > + } > /* > * Target bfqq got moved to a different cgroup or this > process > * started submitting bios for different cgroup? > @@ -756,6 +759,8 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct > bfq_data *bfqd, > bic_set_bfqq(bic, NULL, 1); Maybe it would be nice to add a debug message here, saying we are detaching the process from orig_bfqq. Just that we know that this branch executed. > return bfqg; > } > + printk("%s: bfqq %px move from %px to %px\n", > __func__, > + sync_bfqq, bfqq_group(sync_bfqq), bfqg); > /* We are the only user of this bfqq, just move it > */ > bfq_bfqq_move(bfqd, sync_bfqq, bfqg); ... > - if (bfqq->new_bfqq) > + if (bfqq->new_bfqq) { > + if (bfqq->entity.parent != bfqq->new_bfqq->entity.parent) { > + printk("%s: bfqq %px(%px) new_bfqq %px(%px)\n", > __func__, > + bfqq, bfqq_group(bfqq), bfqq->new_bfqq, > + bfqq_group(bfqq->new_bfqq)); > + BUG_ON(1); OK, so I can see why this triggered. We have: Set new_bfqq for bfqq *eec0: [ 50.207263] bfq_setup_merge: set bfqq ffff88816b16eec0 new_bfqq ffff8881133e1340 parent ffff88814380b000/ffff88814380b000 Move new_bfqq to a root cgroup: [ 50.485933] bfq_reparent_leaf_entity: bfqq ffff8881133e1340 move from ffff88814380b000 to ffff88810b1f1000 Submit bio for root cgroup through *eec0: [ 50.519910] __bfq_bic_change_cgroup: bfqq ffff88816b16eec0, new_bfqq ffff8881133e1340 __bfq_bic_change_cgroup() does nothing as the bio is for the cgroup to which target queue belongs. [ 50.520640] bfq_setup_cooperator: bfqq ffff88816b16eec0(ffff88814380b000) new_bfqq ffff8881133e1340(ffff88810b1f1000) The BUG_ON trips. So at this moment the BUG_ON was just too eager to trigger as we would be submitting IO to a bfq queue belonging to an appropriate cgroup. But I guess it does not make sence to keep unfinished merges over cgroup migration and __bfq_bic_change_cgroup() should have detached task from its bfqq anyway. Can you please try running with a new version of patch 4 which is attached? And perhaps with your debug patch as well... Thanks! Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR
>From 2c32c74eccc94f1f5c689132423cbe5fc9616871 Mon Sep 17 00:00:00 2001 From: Jan Kara <jack@xxxxxxx> Date: Wed, 5 Jan 2022 14:21:04 +0100 Subject: [PATCH] bfq: Update cgroup information before merging bio When the process is migrated to a different cgroup (or in case of writeback just starts submitting bios associated with a different cgroup) bfq_merge_bio() can operate with stale cgroup information in bic. Thus the bio can be merged to a request from a different cgroup or it can result in merging of bfqqs for different cgroups or bfqqs of already dead cgroups and causing possible use-after-free issues. Fix the problem by updating cgroup information in bfq_merge_bio(). CC: stable@xxxxxxxxxxxxxxx Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support") Signed-off-by: Jan Kara <jack@xxxxxxx> --- block/bfq-cgroup.c | 37 +++++++++++++++++++++---------------- block/bfq-iosched.c | 11 +++++++++-- 2 files changed, 30 insertions(+), 18 deletions(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index dbc117e00783..de4db8a0d796 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -729,30 +729,35 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd, } if (sync_bfqq) { - entity = &sync_bfqq->entity; - if (entity->sched_data != &bfqg->sched_data) { + struct bfq_queue *target_bfqq = NULL; + + if (sync_bfqq->new_bfqq) + target_bfqq = sync_bfqq->new_bfqq; + /* + * Target bfqq got moved to a different cgroup or this process + * started submitting bios for different cgroup? + */ + if ((target_bfqq && + sync_bfqq->entity.parent != target_bfqq->entity.parent) || + sync_bfqq->entity.sched_data != &bfqg->sched_data) { /* * Was the queue we use merged to a different queue? - * Detach process from the queue as merge need not be - * valid anymore. We cannot easily cancel the merge as - * there may be other processes scheduled to this - * queue. + * Detach process from the queue as the merge is not + * valid anymore. We cannot easily just cancel the + * merge (by clearing new_bfqq) as there may be other + * processes using this queue and holding refs to all + * queues below sync_bfqq->new_bfqq. Similarly if the + * merge already happened, we need to detach from bfqq + * now so that we cannot merge bio to a request from + * the old cgroup. */ - if (sync_bfqq->new_bfqq) { + if (target_bfqq || bfq_bfqq_coop(sync_bfqq)) { bfq_put_cooperator(sync_bfqq); bfq_release_process_ref(bfqd, sync_bfqq); bic_set_bfqq(bic, NULL, 1); return bfqg; } - /* - * Moving bfqq that is shared with another process? - * Split the queues at the nearest occasion as the - * processes can be in different cgroups now. - */ - if (bfq_bfqq_coop(sync_bfqq)) { - bic->stably_merged = false; - bfq_mark_bfqq_split_coop(sync_bfqq); - } + /* We are the only user of this bfqq, just move it */ bfq_bfqq_move(bfqd, sync_bfqq, bfqg); } } diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 361d321b012a..8a088d77a0b6 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2337,10 +2337,17 @@ static bool bfq_bio_merge(struct request_queue *q, struct bio *bio, spin_lock_irq(&bfqd->lock); - if (bic) + if (bic) { + /* + * Make sure cgroup info is uptodate for current process before + * considering the merge. + */ + bfq_bic_update_cgroup(bic, bio); + bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf)); - else + } else { bfqd->bio_bfqq = NULL; + } bfqd->bio_bic = bic; ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free); -- 2.31.1