Re: [Bug Report] EBUSY for cgroup rmdir after cgroup.procs empty

Michal Koutný <mkoutny@xxxxxxxx> · Tue, 10 Oct 2023 18:31:18 +0200

On Fri, Oct 06, 2023 at 11:37:19AM -0700, "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
> I suppose it's also possible there is PID reuse by the same app,
> causing the cgroup to become repopulated at the same time as a kill,
> but that seems extremely unlikely. Plus, at the point where these
> kills are occurring we shouldn't normally be simultaneously launching
> new processes for the app. Similarly if a process forks right before
> it is killed, maybe it doesn't show up in cgroup.procs until after
> we've observed it to be empty?

Something like this:

							kill (before)
cgroup_fork
cgroup_can_fork .. begin(threadgroup_rwsem)
tasklist_lock
fatal_signal_pending -> cgroup_cancel_fork		kill (mid)
tasklist_unlock
							seq_start,
							seq_next...

cgroup_post_fork  .. end(threadgroup_rwsem)		
							kill (after)

Only the third option, `kill (after)`, means the child would end up on
the css_set list. But in that case a reader that squeezed in before
cgroup_post_fork() would still see the parent.
(I.e. I don't see how the kill/fork race could skew the listed procs.)

(But it reminds me of another pathological case of "group leader
 separation" where:
- there is a multithreaded process,
- threadgroup leader exits,
- threadgroup is migrated from A to B (write to cgroup.procs)
  - but leader stays in A (because it has PF_EXITING),
- A will still show it in cgroup.procs,
- B will not include it in cgroup.procs even though it hosts some
  threads of the threadgroup (i.e. B is effectively populated).
It's been some time and I haven't checked whether it's still possible
nowadays.)
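
For anyone who wants to look for the "group leader separation" case
from userspace, here is a rough sketch. It assumes cgroup v2 (so that
cgroup.threads is available next to cgroup.procs) and that
/proc/<tid>/status is readable for each listed TID. The helper names
(read_ids, tgid_of, hidden_threadgroups) are made up for illustration,
and the snapshot is inherently racy, so a non-empty result is a hint
rather than proof.

```python
import os


def read_ids(path):
    """Read a cgroup.procs / cgroup.threads style file into a set of ints."""
    with open(path) as f:
        return {int(tok) for tok in f.read().split()}


def tgid_of(tid, proc_root="/proc"):
    """Return the Tgid field of /proc/<tid>/status, or None if the task is gone."""
    try:
        with open(os.path.join(proc_root, str(tid), "status")) as f:
            for line in f:
                if line.startswith("Tgid:"):
                    return int(line.split()[1])
    except (FileNotFoundError, ProcessLookupError):
        return None
    return None


def hidden_threadgroups(cgroup_dir, proc_root="/proc"):
    """TGIDs that have threads in cgroup_dir but are absent from its cgroup.procs."""
    procs = read_ids(os.path.join(cgroup_dir, "cgroup.procs"))
    threads = read_ids(os.path.join(cgroup_dir, "cgroup.threads"))
    tgids = {tgid_of(tid, proc_root) for tid in threads}
    tgids.discard(None)  # tasks that exited between the two reads
    return tgids - procs
```

If hidden_threadgroups() on cgroup B returns a TGID that A still lists
in its cgroup.procs, that would match the scenario described above.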

BTW is there any fundamental reason the apps cannot use the
notifications via cgroup.events as recommended by Tejun?
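
To make the suggestion concrete, a minimal sketch of waiting on
cgroup.events instead of re-reading cgroup.procs could look like the
following. It assumes cgroup v2, where the "populated" key lives in
cgroup.events and a state change wakes poll() with POLLPRI. The
function names and the timeout parameter are hypothetical, and error
handling is omitted.

```python
import os
import select


def parse_populated(text):
    """Return True/False for the 'populated' key, or None if the key is absent."""
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key == "populated":
            return value.strip() == "1"
    return None


def wait_until_empty(cgroup_dir, timeout_ms=None):
    """Block until cgroup.events reports 'populated 0'.

    Returns True once the cgroup is empty, False if a poll() call timed out.
    The timeout applies to each individual wait, not to the whole call.
    """
    fd = os.open(os.path.join(cgroup_dir, "cgroup.events"), os.O_RDONLY)
    try:
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        while True:
            # Re-read from the start after every wakeup.
            os.lseek(fd, 0, os.SEEK_SET)
            text = os.read(fd, 4096).decode()
            if parse_populated(text) is False:
                return True
            if not poller.poll(timeout_ms):
                return False
    finally:
        os.close(fd)
```

Calling rmdir only after wait_until_empty() returns True avoids
inferring emptiness from a cgroup.procs read.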

Thanks,
Michal