Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Sat, 25 Jan 2025 at 10:12, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Arguably the user space oddity is just strange and Paolo even calls it
>> a bug, but at the same time, I do think user space can and should
>> reasonably expect that it only has children that it created
>> explicitly [..]
>
> Note that I think that doing things like "io_uring" and getting IO
> helper threads that way would very much count as "explicit children",
> so I don't argue that all kernel helper threads would fall under this
> category.
>
> And I suspect that the normal vhost workers fall under that same kind
> of "it's like io_uring". If you use VHOST_NEW_WORKER to create a
> worker thread, then that's a pretty explicit "I have a child process".
>
> So it's really just that hugepage recovery thread that seems to be a
> bit "too" much of an implicit kernel helper thread that user space
> kind of gets accidentally and implicitly just because of a kernel
> implementation detail.
>
> I'm sure the kvm hack to just start it later (at KVM_RUN time?) is
> sufficient in practice, but it still feels conceptually iffy to me.

I don't think implicit vs explicit is the right question. Rather, we
should be asking: can userspace care?

If I read the context from the commit correctly, what userspace is
asking is: am I single threaded, so that I know nothing funny will
happen in the forked process?

The most common funny behavior I am aware of for forked multi-threaded
processes is that if they fork while another thread is holding a lock,
the forked process might hang forever on that lock, because the thread
that held it does not exist in the child and so the lock will never be
released.

The most interesting part of the hugepage reaper appears to be
kvm_mmu_commit_zap_page, where a page is freed after being flushed
from the TLB.
I would argue that if kvm_mmu_commit_zap_page and friends change the
page tables in a way that userspace can see after a fork, and that in
turn could affect how the forked process will execute, then userspace
is doing something sensible in testing for it.

On the flip side, if this isn't something userspace can observe in its
own process, I would argue that the proper solution is to use a regular
kthread.

In summary, the conceptually clean approach is to only have threads
that, when running, can affect the process they are a part of in a
userspace-visible way. Assuming the hugepage reaper can affect the
process it is a part of, the only problem I see is the hugepage reaper
existing when it has nothing it could possibly do.

I don't think hiding threads is a useful solution, because the threads
will affect the process they are a part of. If the threads aren't
affecting the process they are a part of, we have other solutions
besides threads.

Eric
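
To make the "am I single threaded" question above concrete, here is a
minimal sketch of one way a Linux process could ask it. It assumes that
counting the entries under /proc/self/task is an acceptable test. The
thread does not say how the userspace in question actually performs its
check, and the names count_tasks and demo are hypothetical.

```python
import os
import threading


def count_tasks(pid="self"):
    """Count the tasks (threads) Linux lists under /proc/<pid>/task."""
    return len(os.listdir(f"/proc/{pid}/task"))


def demo():
    """Return the task count before and while one extra thread is alive."""
    before = count_tasks()
    release = threading.Event()
    worker = threading.Thread(target=release.wait)
    worker.start()  # the new thread now has its own /proc/self/task entry
    during = count_tasks()
    release.set()
    worker.join()
    return before, during
```

A process that wants to fork only while single threaded would compare
count_tasks() against 1. Any thread the kernel attaches to the process
shows up in that count whether or not userspace asked for it, which is
why a helper thread that exists before it has any work to do can trip
such a check.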