On Wed, Dec 01, 2021 at 02:34:39PM +0100, Jan Kara <jack@xxxxxxx> wrote:
> After some analysis we've found out that the culprit of the problem is
> that some task is reparented from cgroup G to the root cgroup and G is
> offlined.

Just sharing my interpretation for context -- this was a system using the
unified cgroup hierarchy (io_cgrp_subsys_on_dfl_key was enabled), so what
was observed could also have been the io controller being disabled at the
given level. That would manifest similarly: the task is migrated to the
parent and the former blkcg is offlined.

> +static void bfq_reparent_children(struct bfq_data *bfqd, struct bfq_group *bfqg)
> [...]
> -		bfq_bfqq_move(bfqd, bfqq, bfqd->root_group);
> [...]
> +	hlist_for_each_entry_safe(bfqq, next, &bfqg->children, children_node)
> +		bfq_bfqq_move(bfqd, bfqq, bfqd->root_group);

Here I assume root_group represents the global blkcg root, so this
reparenting skips all ancestors between the removed leaf and the root.
IIUC the associated io_context would then be treated as if it were running
in the root blkcg. (Admittedly, this is not a change introduced by this
patch, but it may cause some surprises if the given process keeps running
after the operation.)

Reparenting to the immediate ancestor should be safe, since cgroup core
ensures that children are offlined before their parents. Would that make
sense to you?

> @@ -897,38 +844,17 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
> [...]
> -	 * It may happen that some queues are still active
> -	 * (busy) upon group destruction (if the corresponding
> -	 * processes have been forced to terminate). We move
> -	 * all the leaf entities corresponding to these queues
> -	 * to the root_group.

This comment is removed, but it seems to me it assumed that the reparented
entities are only transitional remnants of terminated tasks, whereas they
may be processes migrated upwards with a long (IO-active) life ahead of
them.