Hello, On Wed, Mar 02, 2016 at 12:48:46AM +0000, Al Viro wrote: > On Tue, Mar 01, 2016 at 12:06:49PM -0800, Linus Torvalds wrote: > > > So the only access we really care about is the child tid-pointer > > clearing one, and that always happens after PF_EXITING has been set > > afaik. > > > > No other case really matters. If somebody accesses a userfault region > > just as another thread is exiting, we don't care. I don't think it > > would necessarily be wrong to ignore the fault, but I don't think it's > > relevant either, since at that stage the normal "you can signal the > > thread" still works. It's only the child tid access that comes *after* > > we have stopped acceping signals, and that's marked by that > > PF_EXITING. > > > > Or maybe I misunderstood your worry entirely or missed something, and > > my answer above is entirely beside your point. Did you have something > > else in mind? > > No, I've misread de_thread()/zap_other_threads(). No objections to the > patch. I reviewed the fix (a) too and it's sure fine with me too. So I evaluated if we could fix the deadlock by simply preventing to run userland page faults while SIGKILL cannot be delivered anymore. I think skipping those userland accesses wouldn't be strictly required if they were run by a legit app with a userfaultfd manager thread alive and well. Running page faults that late in the exit path with signal disabled was frankly unexpected. So I did a quick attempt to test such an exit-code reordering change, but it's overall more risky and so far I didn't succeed to have a SIGKILL reach handle_userfault() despite I cleaned up that futex code into a proper futex_exit run just before exit_signals instead of inside mm_release. Apparently it's not just PF_EXITING that prevents SIGKILL to reach handle_userfault(). The below change still didn't allow to kill the task: + exit_futex(tsk); /* run before setting PF_EXITING */ exit_signals(tsk); /* sets PF_EXITING */ So again I think for now fix (a) is preferable and I verified it too with the test program. As far as the primary production user of userfaulfd is concerned, no futex will run in the userfaultfd tracked region so no matter what the futex exit code is there for, there's no risk of memory corruption. Thanks, Andrea -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html