Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:

> IMO the right way to handle that would be
> 	1) turn these two do_exit() into do_exit(0), to reduce
> confusion
> 	2) deal with all do_exit() in kthread payloads.  Your
> name for the primitive is fine, IMO.
> 	3) make that primitive pass the return value by way of
> a field in struct kthread, adjusting kthread_stop() accordingly
> and passing 0 to do_exit() in kthread_exit() itself.
>
> (2) is not as trivial as you seem to hope, though.  Your patches
> in drivers/staging/rt*/ had papered over the problem in there,
> but hadn't really solved it.
>
> thread_exit() should've been shot, all right, but it really ought
> to have been complete_and_exit() there.  The thing is, complete()
> + return does *not* guarantee that driver won't get unloaded before
> the thread terminates.  Possibly freeing its .code and leaving
> a thread to resume running in there as soon as it regains CPU.
>
> The point of complete_and_exit() is that it's noreturn *and* in
> core kernel.  So it can be safely used in a modular kthread,
> if paired with wait_for_completion() in or before module_exit.
> complete() + do_exit() (or complete + return as you've gotten
> there) doesn't give such guarantees at all.

I think we are mostly in agreement here.

There are kernel threads started by modules that do:

	complete(...);
	return 0;

Those should at a minimum be calling complete_and_exit(), and possibly
should be restructured to use kthread_stop().

Some of the users of the now removed thread_exit() in staging are
among the offenders.  However, thread_exit() was implemented as:

	#define thread_exit() complete_and_exit(NULL, 0)

That does nothing with a completion; it was just a really funny way
to spell "do_exit(0)".

While I agree that digging through all of the kernel threads and
finding the ones that should be calling complete_and_exit() is a fine
idea, it is a concern independent of these patches.

> I'm (re)crawling through that zoo right now, will post when
> I get more details.

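To make the complete_and_exit() / wait_for_completion() pairing concrete, here is a rough userspace model built on pthreads. It is only a sketch and not kernel code: the completion functions below are stand-ins that borrow the kernel names, and worker(), worker_done and run_model() are hypothetical names invented for the illustration. The model cannot reproduce the module-unload race itself. In the kernel the guarantee comes from complete_and_exit() living in core kernel text, so nothing from the module runs after the completion is signalled.

```c
/*
 * Userspace model built on pthreads.  NOT kernel code: the names mirror
 * the kernel completion API, but every definition below is an
 * illustrative stand-in.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Signal the completion (if any), then end the thread; never returns. */
static void complete_and_exit(struct completion *c, long code)
{
	if (c)
		complete(c);
	pthread_exit((void *)(intptr_t)code);
}

static struct completion worker_done;	/* hypothetical name */

/* Hypothetical thread payload standing in for a modular kthread. */
static void *worker(void *unused)
{
	(void)unused;
	/* ... the thread's real work would go here ... */
	complete_and_exit(&worker_done, 0);
	return NULL;	/* not reached */
}

/* Models thread start plus module_exit; returns the worker's exit code. */
long run_model(void)
{
	pthread_t tid;
	void *ret;

	init_completion(&worker_done);
	if (pthread_create(&tid, NULL, worker, NULL) != 0)
		return -1;
	wait_for_completion(&worker_done);	/* the module_exit side */
	pthread_join(tid, &ret);		/* model-only cleanup */
	return (long)(intptr_t)ret;
}
```

Passing NULL as the completion, as the old thread_exit() macro did, skips the signalling branch entirely, so in this model it reduces to a plain thread exit with code 0.
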
Eric