Re: Use after free with BFQ and cgroups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello.

On Thu, Nov 25, 2021 at 06:28:09PM +0100, Jan Kara <jack@xxxxxxx> wrote:
[...]
+Cc cgroups ML
https://lore.kernel.org/linux-block/20211125172809.GC19572@xxxxxxxxxxxxxx/


I understand there are more objects than blkcgs but I assume it can
eventually boil down to blkcg references, so I suggest another
alternative. (But I may easily miss the relations between BFQ objects,
so consider this only high-level opinion.)

> After some poking, looking into crashdumps, and applying some debug patches
> the following seems to be happening: We have a process P in blkcg G. Now
> G is taken offline so bfq_group is cleaned up in bfq_pd_offline() but P
> still holds reference to G from its bfq_queue. Then P submits IO, G gets
> inserted into service tree despite being already offline.

(If G is offline, P can only be zombie, just saying. (I guess it can
still be Q's IO on behalf of G.))

IIUC, the reference to G is only held by P. If the G reference is copied
into another structure (the service tree) it should get another
reference. My naïve proposal would be css_get(). (1)

> IO completes, P exits, bfq_queue pointing to G gets destroyed, the
> last reference to G is dropped, G gets freed although is it still
> inserted in the service tree.  Eventually someone trips over the freed
> memory.

Isn't it the bfq_queue.bfq_entity that's inserted in the service tree
(not blkcg G)?
You write bfq_queue is destroyed, shouldn't that remove it from the
service tree? (2)

> Now I was looking into how to best fix this. There are several
> possibilities and I'm not sure which one to pick so that's why I'm writing
> to you. bfq_pd_offline() is walking all entities in service trees and
> trying to get rid of references to bfq_group (by reparenting entities).
> Is this guaranteed to see all entities that point to G? From the scenario
> I'm observing it seems this can miss entities pointing to G - e.g. if they
> are in idle tree, we will just remove them from the idle tree but we won't
> change entity->parent so they still point to G. This can be seen as one
> culprit of the bug.

There can be two types of references to blkcg (transitively via
bfq_group):
a) "plain" (just a pointer stored somewhere),
b) "pinned" (marked by css_get() of the respective blkcg).

The bfq_pd_offline() callback should erase all plain references (e.g. by
reparenting) or poke the holders of pinned references to release (unpin)
them eventually (so that blkcg goes away).

I reckon it's not possible to traverse all references in the
bfq_pd_offline().

> Or alternatively, should we e.g. add __bfq_deactivate_entity() to
> bfq_put_queue() when that function is dropping last queue in a bfq_group?

I guess this is what I wondered about in (2). (But I'm not sure this
really is proof against subsequent re-insertions into the tree.)

> Or should we just reparent bfq queues that have already dead parent on
> activation?

If (1) used css_tryget_online(), the parent (or ancestor if it happened
to be offlined too) could be the fallback.

> What's your opinion?

The question here is how long would stay the offlined blkcgs around if
they were directly pinned upon the IO submission. If it's unbound, then
reparenting makes more sense.


Michal



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux