On Wed, May 10, 2023 at 3:27 PM Peter Xu <peterx@xxxxxxxxxx> wrote: > > The change looks all right to me. I hate how this path has turned very architecture-specific again, and this patch makes it worse. For a short while we shared a lot of the code between architectures thanks to the new fault helpers (ie fault_signal_pending() and friends). Now CONFIG_PER_VMA_LOCK has made a mockery of that, and this patch makes it worse by just fixing x86. As is historically always the case. And I think that patch is actually buggy. Because it does that ret = VM_FAULT_SIGBUS; if (fpin) goto out_retry; in the fault path, and now returns VM_FAULT_SIGBUS | VM_FAULT_RETRY as a result. And there are a lot of users that will be *very* unhappy about that combination. Look at mm/gup.c, for example. It will do if (ret & VM_FAULT_ERROR) { int err = vm_fault_to_errno(ret, *flags); if (err) return err; ... and not react to VM_FAULT_RETRY being set at all - so it won't clear the "locked" variable, for example. And that all just makes it very clear how much of a painpoint that conditional lock release is. It's horrendous. That's always a huge pain in general, and it really complicates how things get handled. It has very real and obvious historical reasons ("we never released the lock" being converted one case at a time to "we release the lock in this situation and set the bit to tell you"), but I wonder if we should just say "handle_mm_fault() _always_ releases the lock". We'd still keep the RETRY bit as a "this did not complete, you need to retry", but at least the whole secondary meaning of "oh, and if it isn't set, you need to release the lock you took" would go away. Because RETRY has really annoying semantics today. Again - I know exactly _why_ it has those horrendous semantics, and I'm not blaming anybody, but it makes it really painful to deal with 15 different architectures that then have to deal with those things. How painful would it be to try to take baby steps and remove those kinds of things and slowly move towards a situation where RETRY isn't such a magically special bit? Peter Xu, you did a lot of the helper function cleanups, and moved some of the accounting code into generic code. It's a few years ago, but maybe you still remember this area.. And Luto, I'm adding you to the participants because you've touched that code more than most. Could we make the rule be that handle_mm_fault() just *always* releases the lock? How painful would that be? For many of the architectures, I *think* it would mean just removing the mmap_read_unlock(mm); in the fault handling path (after the RETRY test). But there's a couple of other uses of handle_mm_fault() too, notably GUP. Am I just dreaming? Or could we try to simplify this area first? Because I think Johannes' patch right now is quite buggy, and only converted one caller, when a lot of other callers will currently find it very problematic to see both VM_FAULT_RETRY and one of the error bits.. It's also possible that I'm misreading things. But I really think the lock handling would be a lot easier if the rule was "it is locked on entry, it's always unlocked on exit". Of course, with the vma locking, the "it" above is nontrivial too. That sure didn't simplify thinking about this all.... Linus