On Mon, Jun 21, 2021 at 01:54:56PM +0000, Al Viro wrote:
> On Tue, Jun 15, 2021 at 02:58:12PM -0700, Linus Torvalds wrote:
>
> > And I think our horrible "kernel threads return to user space when
> > done" is absolutely horrifically nasty. Maybe of the clever sort, but
> > mostly of the historical horror sort.
>
> How would you prefer to handle that, then? Separate magical path from
> kernel_execve() to switch to userland? We used to have something of
> that sort, and that had been a real horror...
>
> As it is, it's "kernel thread is spawned at the point similar to
> ret_from_fork(), runs the payload (which almost never returns) and
> then proceeds out to userland, same way fork(2) would've done."
> That way kernel_execve() doesn't have to do anything magical.
>
> Al, digging through the old notes and current call graph...

There's a large mess around do_exit() - we have a bunch of callers
all over arch/*; if nothing else, I very much doubt that we really
want to let tracer play with a thread in the middle of die_if_kernel()
or similar.

We sure as hell do not want to arrange for anything on the kernel
stack in such situations, no matter what's done in exit(2)...