Hi.

On Tue, Oct 24, 2023 at 04:10:32PM -0700, "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
> Back on this and pretty sure I discovered what's happening. For
> processes with multiple threads where each thread has reached
> atomic_dec_and_test(&tsk->signal->live) in do_exit (but not all have
> reached cgroup_exit yet), subsequent reads of cgroup.procs will skip
> over the process with not-yet-fully-exited thread group members
> because the read of task->signal->live evaluates to 0 here in
> css_task_iter_advance:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/cgroup/cgroup.c?h=v6.5#n4869

Well done! It sounds plausible: task->signal->live is not synced via
css_set_lock.

> But the cgroup is not removable yet because cgroup_exit hasn't been
> called for all tasks.
>
> Since all tasks have been signaled in this case and we're just waiting
> for the exits to complete, I think it should be possible to turn the
> cgroup into a zombie on rmdir with the current behavior of
> cgroup.procs.

In this case it could be removed, but it would make the check in
cgroup_destroy_locked() way too complicated (if I understand your idea).

> Or if we change cgroup.procs to continue showing the thread group
> leader until all threads have finished exiting, we'd still probably
> have to change our userspace to accommodate the longer kill times
> exceeding our timeouts.

Provided this is the cause, couldn't you already get this (timewise)
more precise info from cgroup.threads? (PR [1] has a reproducer. Its fix
describes exactly the opposite listings, which is confusing, but I think
the fix actually works because it additionally checks cgroup.threads.)

> So I'm going to change our userspace anyway as suggested by Tejun. But
> I'd be interested to hear what folks think about the potential kernel
> solutions as well.

Despite that, I'd stick with the notifications since they rely on
proper synchronization of cgroup-info.
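To illustrate what I mean by sticking with the notifications, here is a rough userspace sketch (not taken from any existing tool) that waits for the "populated" key in cgroup.events to drop to 0 instead of repeatedly reading cgroup.procs. The helper names parse_populated() and wait_until_empty() are made up for this example, and I am assuming cgroup v2 semantics where a change to cgroup.events wakes a poll() waiting for POLLPRI.

```python
import os
import select
import time


def parse_populated(events_text):
    """Return True/False for the 'populated' key of cgroup.events content.

    Returns None if the key is missing (hypothetical helper).
    """
    for line in events_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "populated":
            return value.strip() == "1"
    return None


def wait_until_empty(cgroup_dir, timeout_s):
    """Wait until cgroup.events reports 'populated 0' or the timeout expires.

    Returns True if the cgroup became empty, False on timeout.
    Assumption: modifications of cgroup.events are signalled as POLLPRI.
    """
    path = os.path.join(cgroup_dir, "cgroup.events")
    deadline = time.monotonic() + timeout_s
    with open(path) as f:
        poller = select.poll()
        poller.register(f.fileno(), select.POLLPRI)
        while True:
            # Re-read from the start; the file content is regenerated on read.
            f.seek(0)
            if parse_populated(f.read()) is False:
                return True
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # poll() takes milliseconds; wake up on an event or at the deadline.
            poller.poll(max(1, int(remaining * 1000)))
```

A caller would signal the tasks first, then call something like wait_until_empty("/sys/fs/cgroup/<your cgroup>", 5.0) and only rmdir the cgroup once it returns True. The path here is a placeholder, not a real location.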

HTH,
Michal

[1] https://github.com/systemd/systemd/pull/23561