On Thu, Oct 28, 2021 at 2:21 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> They do look fairly similar but we should have the information in the
> fault handler to distinguish: not a page fault (pte permission or p*d
> translation), in_task(), user address, fixup handler. But I agree the
> logic looks fragile.

So thinking about this a bit more, I think I have a possible
suggestion for how to handle this.

The pointer color fault (or whatever some other architecture may do to
generate sub-page faults) is not only not recoverable in the sense
that we can't fix it up, it also ends up being a forced SIGSEGV (ie it
can't be blocked - it has to either be caught or cause the process to
be killed).

And the thing is, I think we could just make the rule be that kernel
code that has this kind of retry loop with fault_in_pages() would
force an EFAULT on a pending SIGSEGV.

IOW, the pending SIGSEGV could effectively be exactly that "thread flag".

And that means that fault_in_xyz() wouldn't need to worry about this
situation at all: the regular copy_from_user() (or whatever flavor it
is - to/from/iter/whatever) would take the fault. And if it's a
regular page fault, it would act exactly like it does now, so no
changes.

If it's a sub-page fault, we'd just make the rule be that we send a
SIGSEGV even if the instruction in question has a user exception
fixup.

Then we just need to add the logic somewhere that does "if active
pending SIGSEGV, return -EFAULT".

Of course, that logic might be in fault_in_xyz(), but it might also be
a separate function entirely.

So this does effectively end up being a thread flag, but it's also
slightly more than that - it's that a sub-page fault from kernel mode
has semantics that a regular page fault does not.

The whole "kernel access doesn't cause SIGSEGV, but returns -EFAULT
instead" has always been an odd and somewhat wrong-headed thing. Of
course it should cause a SIGSEGV, but that's not how Unix
traditionally worked.
We would just say "color faults always raise a signal, even if the
color fault was triggered in a system call".

(And I didn't check: I say "SIGSEGV" above, but maybe the pointer
color faults are actually SIGBUS? Doesn't change the end result).

             Linus
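
As a rough illustration of the rule described above, here is a small
userspace C model of it. This is not kernel code: task_sim, ubuf_sim,
copy_sim(), fault_in_sim() and retry_loop_sim() are all hypothetical
stand-ins for the thread's pending-signal state, a user buffer,
copy_from_user(), fault_in_xyz() and the caller's retry loop, and the
retry bound of 8 is arbitrary. It only shows how a pending SIGSEGV
could act as the "thread flag" that ends the loop.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state; models a forced, unblockable SIGSEGV. */
struct task_sim {
	bool sigsegv_pending;
};

enum fault_kind { FAULT_NONE, FAULT_PAGE, FAULT_SUBPAGE };

/* Hypothetical user buffer: the kind of fault the next access will hit. */
struct ubuf_sim {
	enum fault_kind next_fault;
	size_t len;
};

/* Models copy_from_user(): returns the number of bytes NOT copied. */
static size_t copy_sim(struct task_sim *t, struct ubuf_sim *u)
{
	switch (u->next_fault) {
	case FAULT_NONE:
		return 0;
	case FAULT_SUBPAGE:
		/* Proposed rule: raise the signal despite the exception fixup. */
		t->sigsegv_pending = true;
		return u->len;
	case FAULT_PAGE:
		/* Regular page fault: unchanged, just report a short copy. */
		return u->len;
	}
	return u->len;
}

/* Models fault_in_xyz(): 0 on success, -EFAULT if a SIGSEGV is pending. */
static int fault_in_sim(struct task_sim *t, struct ubuf_sim *u)
{
	if (t->sigsegv_pending)
		return -EFAULT;
	if (u->next_fault == FAULT_PAGE)
		u->next_fault = FAULT_NONE;	/* page is now resident */
	return 0;
}

/* Models the caller's retry loop around the copy and the fault-in. */
static int retry_loop_sim(struct task_sim *t, struct ubuf_sim *u)
{
	int tries;

	for (tries = 0; tries < 8; tries++) {
		if (copy_sim(t, u) == 0)
			return 0;
		if (fault_in_sim(t, u))
			return -EFAULT;
	}
	return -EAGAIN;	/* bound exists only so the model cannot spin */
}
```

In this model a buffer that starts out as FAULT_PAGE succeeds on the
second pass without any signal being raised, while a FAULT_SUBPAGE
buffer leaves sigsegv_pending set and makes the loop return -EFAULT
instead of retrying forever. Whether the check lives in fault_in_xyz()
or in a separate helper is left open, as in the mail.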