Re: [Bug Report] EBUSY for cgroup rmdir after cgroup.procs empty

On Tue, Oct 10, 2023 at 9:31 AM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> On Fri, Oct 06, 2023 at 11:37:19AM -0700, "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
> > I suppose it's also possible there is PID reuse by the same app,
> > causing the cgroup to become repopulated at the same time as a kill,
> > but that seems extremely unlikely. Plus, at the point where these
> > kills are occurring we shouldn't normally be simultaneously launching
> > new processes for the app. Similarly if a process forks right before
> > it is killed, maybe it doesn't show up in cgroup.procs until after
> > we've observed it to be empty?
>
> Something like this:
>
>                                                         kill (before)
> cgroup_fork
> cgroup_can_fork .. begin(threadgroup_rwsem)
> tasklist_lock
> fatal_signal_pending -> cgroup_cancel_fork              kill (mid)
> tasklist_unlock
>                                                         seq_start,
>                                                         seq_next...
>
> cgroup_post_fork  .. end(threadgroup_rwsem)
>                                                         kill (after)
>
> Only the third option `kill (after)` means the child would end up on the
> css_set list. But that would mean the reader squeezed before
> cgroup_post_fork() would still see the parent.
> (I.e. I don't see the kill/fork race could skew the listed procs.)
>
So here is a trace from a phone where the kills happen ~100 ms after
the forks. All but one of the children die before we read cgroup.procs
for the first time, and cgroup.procs is not empty. 5 ms later we read
again and cgroup.procs is empty, but the last child still hasn't
exited. So it makes sense that the cset from that last child is still
on the list.
https://pastebin.com/raw/tnHhnZBE
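For context, the userspace sequence in that trace is roughly: read
cgroup.procs, SIGKILL everything listed, re-read until the file looks
empty, then rmdir. A minimal sketch of that loop (paths and helper
names are mine, not the actual Android code):

```python
import os
import signal
import time


def read_procs(cgroup_dir):
    """Parse cgroup.procs into a list of PIDs (one per line)."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def kill_and_remove(cgroup_dir, timeout=1.0):
    """Kill every listed process, wait until cgroup.procs reads empty,
    then try to remove the directory. rmdir can still fail with EBUSY
    if a dying child's css_set hasn't been unlinked yet, which is the
    window discussed above."""
    for pid in read_procs(cgroup_dir):
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already reaped
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not read_procs(cgroup_dir):
            break
        time.sleep(0.005)
    os.rmdir(cgroup_dir)  # may raise OSError(EBUSY) in that window
```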

> (But it reminds me another pathological case of "group leader
>  separation" where:
> - there is a multithreaded process,
> - threadgroup leader exits,
> - threadgroup is migrated from A to B (write to cgroup.procs)
>   - but leader stays in A (because it has PF_EXITING),
> - A will still show it in cgroup.procs,
> - B will not include it in cgroup.procs despite it hosts some threads of
>   the threadgroup (effectively populated).
> It's been some time, I didn't check if it's still possible nowadays.)
>
I don't think this is what's happening, because these processes only
get migrated once in their lifetimes (from the zygote cgroup to their
per-application cgroup), and the user/app code doesn't run until after
the migration completes, which is the first time a thread could be
created.

> BTW is there any fundamental reason the apps cannot use the
> notifications via cgroup.events as recommended by Tejun?
>
This would require that we read both cgroup.procs and cgroup.events,
since we'd still want to know which processes to signal. I assumed
this would increase lock contention, but there's no synchronization on
cgroup_is_populated, so apparently it wouldn't. I had already
identified this as a workaround, but I'd prefer to depend on just one
file to do everything.
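For what it's worth, the cgroup.events approach would look roughly
like the sketch below: poll cgroup.events for POLLPRI and only attempt
rmdir once "populated" flips to 0. The helper names are mine; the
"populated <0|1>" key/value format and the POLLPRI notification are
from the cgroup v2 docs.

```python
import os
import select


def is_populated(cgroup_dir):
    """Parse the 'populated' key out of cgroup.events."""
    with open(os.path.join(cgroup_dir, "cgroup.events")) as f:
        for line in f:
            key, _, value = line.partition(" ")
            if key == "populated":
                return value.strip() == "1"
    raise KeyError("populated")


def wait_until_empty(cgroup_dir, timeout_ms=1000):
    """Block until the kernel reports the cgroup unpopulated.
    cgroup.events raises POLLPRI when its contents change."""
    events = os.path.join(cgroup_dir, "cgroup.events")
    with open(events) as f:
        poller = select.poll()
        poller.register(f.fileno(), select.POLLPRI | select.POLLERR)
        while is_populated(cgroup_dir):
            if not poller.poll(timeout_ms):
                return False  # timed out
            f.seek(0)  # re-arm the notification before re-checking
    return True
```

Waiting on "populated" instead of re-reading cgroup.procs would close
the race with a still-exiting child, but as noted, we'd still need
cgroup.procs to know which PIDs to signal in the first place.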



