On Mon, Jun 21, 2021 at 06:59:01PM +0000, Al Viro wrote:
> On Mon, Jun 21, 2021 at 01:54:56PM +0000, Al Viro wrote:
> > On Tue, Jun 15, 2021 at 02:58:12PM -0700, Linus Torvalds wrote:
> >
> > > And I think our horrible "kernel threads return to user space when
> > > done" is absolutely horrifically nasty. Maybe of the clever sort, but
> > > mostly of the historical horror sort.
> >
> > How would you prefer to handle that, then? Separate magical path from
> > kernel_execve() to switch to userland? We used to have something of
> > that sort, and that had been a real horror...
> >
> > As it is, it's "kernel thread is spawned at the point similar to
> > ret_from_fork(), runs the payload (which almost never returns) and
> > then proceeds out to userland, same way fork(2) would've done."
> > That way kernel_execve() doesn't have to do anything magical.
> >
> > Al, digging through the old notes and current call graph...
>
> There's a large mess around do_exit() - we have a bunch of
> callers all over arch/*; if nothing else, I very much doubt that really
> want to let tracer play with a thread in the middle of die_if_kernel()
> or similar.
>
> We sure as hell do not want to arrange for anything on the kernel
> stack in such situations, no matter what's done in exit(2)...

FWIW, on alpha it's die_if_kernel(), do_entUna() and do_page_fault(),
all in not-from-userland cases.

On m68k - die_if_kernel(), do_page_fault() (both for non-from-userland
cases) and something really odd - fpsp040_die(). Exception handling for
floating point stuff on 68040? Looks like it has an open-coded
copy_to_user()/copy_from_user(), with faults doing hard do_exit(SIGSEGV)
instead of raising a signal and trying to do something sane...

I really don't want to try and figure out how painful would it be to
teach that code how to deal with faults - _testing_ anything in that
area sure as hell will be. IIRC, details of recovery from FPU exceptions
on 68040 in the manual left impression of a minefield...