Re: [Bug Report] EBUSY for cgroup rmdir after cgroup.procs empty

Hi.

On Tue, Oct 24, 2023 at 04:10:32PM -0700, "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
> Back on this and pretty sure I discovered what's happening. For
> processes with multiple threads where each thread has reached
> atomic_dec_and_test(&tsk->signal->live) in do_exit (but not all have
> reached cgroup_exit yet), subsequent reads of cgroup.procs will skip
> over the process with not-yet-fully-exited thread group members
> because the read of task->signal->live evaluates to 0 here in
> css_task_iter_advance:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/cgroup/cgroup.c?h=v6.5#n4869

Well done! That sounds plausible: task->signal->live is not
synchronized via css_set_lock.

> 
> But the cgroup is not removable yet because cgroup_exit hasn't been
> called for all tasks.
> 
> Since all tasks have been signaled in this case and we're just waiting
> for the exits to complete, I think it should be possible to turn the
> cgroup into a zombie on rmdir with the current behavior of
> cgroup.procs.

In this case it could be removed but it would make the check in
cgroup_destroy_locked() way too complicated (if I understand your idea).

> 
> Or if we change cgroup.procs to continue showing the thread group
> leader until all threads have finished exiting, we'd still probably
> have to change our userspace to accommodate the longer kill times
> exceeding our timeouts.

Provided this is the cause, couldn't you already get more precise
(time-wise) information from cgroup.threads? (PR [1] has a reproducer;
its fix describes exactly the opposite listings, which is confusing, but
I think the fix actually works because it additionally checks
cgroup.threads.)
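
A minimal userspace sketch of such a double check, assuming cgroup v2
and a caller that passes the cgroup directory path. The helper names
read_ids and looks_empty are hypothetical, and an empty result is only a
hint, not a guarantee that rmdir will succeed:

```python
import os


def read_ids(cgroup_dir, filename):
    """Return the IDs listed in a cgroup interface file (one per line)."""
    with open(os.path.join(cgroup_dir, filename)) as f:
        return [int(token) for token in f.read().split()]


def looks_empty(cgroup_dir):
    """True if neither cgroup.procs nor cgroup.threads lists anything.

    cgroup.procs may already skip a process whose threads are still
    exiting, so cgroup.threads is consulted as well.
    """
    return (not read_ids(cgroup_dir, "cgroup.procs")
            and not read_ids(cgroup_dir, "cgroup.threads"))
```

A caller would retry rmdir only after looks_empty() returns True, and
still handle EBUSY.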

> So I'm going to change our userspace anyway as suggested by Tejun. But
> I'd be interested to hear what folks think about the potential kernel
> solutions as well.

Despite that, I'd stick with the notifications since they rely on
proper synchronization of cgroup-info.
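
A rough sketch of waiting on such a notification, assuming what is meant
is the cgroup v2 cgroup.events file and its "populated" key, and that
poll() with POLLPRI is used to wait for changes. The function names
parse_events and wait_unpopulated are hypothetical:

```python
import os
import select
import time


def parse_events(text):
    """Parse the flat "key value" lines of cgroup.events into a dict."""
    events = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            events[key] = int(value)
    return events


def wait_unpopulated(cgroup_dir, timeout_s):
    """Return True once "populated 0" is seen, False on timeout."""
    path = os.path.join(cgroup_dir, "cgroup.events")
    deadline = time.monotonic() + timeout_s
    with open(path) as f:
        poller = select.poll()
        poller.register(f, select.POLLPRI)
        while True:
            # Re-read from the start on every wakeup.
            f.seek(0)
            if parse_events(f.read()).get("populated") == 0:
                return True
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # poll() takes milliseconds; wait at least 1 ms.
            poller.poll(max(1, int(remaining * 1000)))
```

rmdir would then be attempted only after wait_unpopulated() returns
True, instead of after polling cgroup.procs.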

HTH,
Michal

[1] https://github.com/systemd/systemd/pull/23561
