On Mon 29-11-21 07:12:42, Tejun Heo wrote:
> On Fri, Nov 26, 2021 at 03:47:24PM +0100, Michal Koutný wrote:
> > The question here is how long would stay the offlined blkcgs around if
> > they were directly pinned upon the IO submission. If it's unbound, then
> > reparenting makes more sense.
>
> It should be fine to pin whatever's necessary while related IOs are in
> flight and percpu_ref used for css refcnting isn't gonna make any noticeable
> difference in terms of overhead.

Yes, holding a cgroup ref from IO would be fine. But that is not really our
problem. The problem is that a bfq_queue associated with a task effectively
holds a reference to the potentially dead cgroup, and that reference can stay
there until the task (which itself got reparented to the root cgroup) exits.
So I think we need to reparent these bfq_queue structures as well, to avoid
holding the cgroup in a zombie state excessively long.

                                                                Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR