Re: [PATCH] bfq: Fix use-after-free with cgroups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 07-12-21 20:08:43, Michal Koutný wrote:
> On Wed, Dec 01, 2021 at 02:34:39PM +0100, Jan Kara <jack@xxxxxxx> wrote:
> > After some analysis we've found out that the culprit of the problem is
> > that some task is reparented from cgroup G to the root cgroup and G is
> > offlined.
> 
> Just sharing my interpretation for context -- (I saw this was a system
> using the unified cgroup hierarchy, io_cgrp_subsys_on_dfl_key was
> enabled) and what was observed could also have been disabling the io
> controller on given level -- that would also manifest similarly -- the
> task is migrated to parent and the former blkcg is offlined.

Yes, that's another possibility.

> > +static void bfq_reparent_children(struct bfq_data *bfqd, struct bfq_group *bfqg)
> > [...]
> > -	bfq_bfqq_move(bfqd, bfqq, bfqd->root_group);
> > [...]
> > +	hlist_for_each_entry_safe(bfqq, next, &bfqg->children, children_node)
> > +		bfq_bfqq_move(bfqd, bfqq, bfqd->root_group);
> 
> Here I assume root_group is (representing) the global blkcg root and
> this reparenting thus skips all ancestors between the removed leaf and
> the root. IIUC the associated io_context would then be treated as if it
> was running in the root blkcg.
> (Admittedly, this isn't a change from this patch but it may cause some
> surprises if the given process runs after the operation.)

Yes, this is what happens in bfq_reparent_children() and basically
preserves what BFQ was already doing for a subset of bfq queues.

> Reparenting to the immediate ancestors should be safe as cgroup core
> should ensure children are offlined before parents. Would it make sense
> to you?

I suppose yes, it makes more sense to reparent just to immediate parents
instead of the root of the blkcg hierarchy. Initially when developing the
patch I was not sure whether parent has to be still alive but as you write
it should be safe. I'll modify the patch to:

static void bfq_reparent_children(struct bfq_data *bfqd, struct bfq_group *bfqg)
{
        struct bfq_queue *bfqq;
        struct hlist_node *next;
        struct bfq_group *parent;

        parent = bfqg_parent(bfqg);
        if (!parent)
                parent = bfqd->root_group;

        hlist_for_each_entry_safe(bfqq, next, &bfqg->children, children_node)
                bfq_bfqq_move(bfqd, bfqq, parent);
}

 
> > @@ -897,38 +844,17 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
> > [...]
> > -		 * It may happen that some queues are still active
> > -		 * (busy) upon group destruction (if the corresponding
> > -		 * processes have been forced to terminate). We move
> > -		 * all the leaf entities corresponding to these queues
> > -		 * to the root_group.
> 
> This comment is removed but it seems to me it assumed that the
> reparented entities are only some transitional remainings of terminated
> tasks but they may be the processes migrated upwards with a long (IO
> active) life ahead.

Yes, this seemed to be a misconception of the old code...

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux