Re: [PATCH 3/4] bfq: Split shared queues on move between cgroups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 13-01-22 15:08:50, yukuai (C) wrote:
> 在 2022/01/13 11:57, yukuai (C) 写道:
> > 在 2022/01/12 19:39, Jan Kara 写道:
> > > When bfqq is shared by multiple processes it can happen that one of the
> > > processes gets moved to a different cgroup (or just starts submitting IO
> > > for different cgroup). In case that happens we need to split the merged
> > > bfqq as otherwise we will have IO for multiple cgroups in one bfqq and
> > > we will just account IO time to wrong entities etc.
> > > 
> > > Similarly if the bfqq is scheduled to merge with another bfqq but the
> > > merge didn't happen yet, cancel the merge as it need not be valid
> > > anymore.
> > > 
> > > CC: stable@xxxxxxxxxxxxxxx
> > > Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling
> > > and cgroups support")
> > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > > ---
> > >   block/bfq-cgroup.c  | 25 ++++++++++++++++++++++++-
> > >   block/bfq-iosched.c |  2 +-
> > >   block/bfq-iosched.h |  1 +
> > >   3 files changed, 26 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
> > > index 24a5c5329bcd..dbc117e00783 100644
> > > --- a/block/bfq-cgroup.c
> > > +++ b/block/bfq-cgroup.c
> > > @@ -730,8 +730,31 @@ static struct bfq_group
> > > *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
> > >       if (sync_bfqq) {
> > >           entity = &sync_bfqq->entity;
> > > -        if (entity->sched_data != &bfqg->sched_data)
> > > +        if (entity->sched_data != &bfqg->sched_data) {
> > > +            /*
> > > +             * Was the queue we use merged to a different queue?
> > > +             * Detach process from the queue as merge need not be
> > > +             * valid anymore. We cannot easily cancel the merge as
> > > +             * there may be other processes scheduled to this
> > > +             * queue.
> > > +             */
> > > +            if (sync_bfqq->new_bfqq) {
> > > +                bfq_put_cooperator(sync_bfqq);
> > Hi,
> > 
> > The patch " bfq: Simplify bfq_put_cooperator()" in last version is not
> > in this patch set, thus bfq_put_cooperator() won't set
> > sync_bfqq->new_bfqq to NULL. So I guess the problem still exist?
> > 
> > Thanks,
> > Kuai
> > > +                bfq_release_process_ref(bfqd, sync_bfqq);
> > > +                bic_set_bfqq(bic, NULL, 1);
> Hi,
> 
> I understand now that you set NULL here,

Yes.

> however I still can repoduce
> the problem with this patch set applied.

OK.

> Here I add some debug message:
> 
> diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
> index dbc117e00783..2d3da8e73ec0 100644
> --- a/block/bfq-cgroup.c
> +++ b/block/bfq-cgroup.c
> @@ -754,6 +754,7 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct
> bfq_data *bfqd,
>                                 bfq_mark_bfqq_split_coop(sync_bfqq);
>                         }
>                         bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
> +                       printk("%s: bfqq %px move to %px\n", __func__,
> sync_bfqq, bfqg);
>                 }
>         }
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 07be51bc229b..74d5b575626f 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -2797,6 +2797,7 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct
> bfq_queue *new_bfqq)
>          * are likely to increase the throughput.
>          */
>         bfqq->new_bfqq = new_bfqq;
> +       printk("%s: bfqq %px new_bfqq %px bfqg %px\n", __func__, bfqq,
> new_bfqq, bfqq_group(bfqq));
>         new_bfqq->ref += process_refs;
>         return new_bfqq;
>  }
> @@ -2963,8 +2964,14 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct
> bfq_queue *bfqq,
>         if (bfq_too_late_for_merging(bfqq))
>                 return NULL;
> 
> -       if (bfqq->new_bfqq)
> +       if (bfqq->new_bfqq) {
> +               if (bfqq->entity.parent != bfqq->new_bfqq->entity.parent) {
> +                       printk("%s: bfqq %px (%px), new_bfqq %px (%px)\n",
> __func__,
> +                                       bfqq, bfqq_group(bfqq),
> bfqq->new_bfqq, bfqq_group(bfqq->new_bfqq));
> +                       BUG_ON(1);
> +               }
>                 return bfqq->new_bfqq;
> +       }
> 
>         if (!io_struct || unlikely(bfqq == &bfqd->oom_bfqq))
>                 return NULL;
> 
> And here is the messages when BUG_ON is triggered(much easier than the
> uaf problem):
> 
> [  157.237821] bfq_setup_merge: bfqq ffff888140654dc0 new_bfqq
> ffff888174122100 bfqg ffff88817abc6000
> [  157.238739] __bfq_bic_change_cgroup: bfqq ffff888174122100 move to
> ffff888104b7c000
> [  157.239675] bfq_setup_cooperator: bfqq ffff888140654dc0
> (ffff88817abc6000), new_bfqq ffff888174122100 (ffff888104b7c000)

Thanks for the debugging! So this shows that it is not 'bfqq' that changed
parent but rather bfqq->new_bfqq that changed parent. And presumably it can
even happen that the bfqq->new_bfqq parent gets offlined and UAF eventually
triggered. Indeed, I didn't think about that possibility. I have to think
how to handle this in the best way...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux