On Tue, Apr 27, 2021 at 1:25 AM Simon Ser <contact@xxxxxxxxxxx> wrote:
>
> Rather than requiring changes in all compositors *and* clients, can we
> maybe only require changes in compositors? For instance, OpenBSD has a
> __MAP_NOFAULT flag. When passed to mmap, it means that out-of-bound
> accesses will read as zeroes instead of triggering SIGBUS. Such a flag
> would be very helpful to unblock the annoying SIGBUS situation.
>
> Would something along these lines be welcome in the Linux kernel?

Hmm. It doesn't look too hard to do. The biggest problem is actually
that we've run out of flags in the vma (on 32-bit architectures), but
you could try this UNTESTED patch that just does the MAP_NOFAULT thing
unconditionally.

NOTE! Not only is it untested, not only is this a "for your testing
only" (because it does it unconditionally rather than only for
__MAP_NOFAULT), but it might be bogus for other reasons. In
particular, this patch depends on "vmf->address" not being changed by
the ->fault() infrastructure, so that we can just re-use the vmf for
the anonymous case if we get a SIGBUS.

I think that's all ok these days, because Kirill and Peter Xu cleaned
up those paths, but I didn't actually check. So I'm cc'ing Kirill,
Peter and Will, who have been working in this area for other reasons
fairly recently.

Side note: this will only ever work for non-shared mappings. That's
fundamental. We won't add an anonymous page to a shared mapping, and
do_anonymous_page() does verify that. So a MAP_SHARED mapping will
still return SIGBUS even with this patch (although it's not obvious
from the patch - the VM_FAULT_SIGBUS will just be re-created by
do_anonymous_page()).

So if you want a _shared_ mapping to honor __MAP_NOFAULT and insert
random anonymous pages into it, I think the answer is "no, that's not
going to be viable".

So _if_ this works for you, and if it's ok that only MAP_PRIVATE can
have __MAP_NOFAULT, and if Kirill/Peter/Will don't say "Oh, Linus,
you're completely off your rocker and clearly need to be taking your
meds", something like this - if we figure out the conditional bit -
might be doable.

That's a fair number of "ifs".

Ok, back to the merge window for me, I'll be throwing away this crazy
untested patch immediately after hitting "send". This is very much a
"throw the idea over to other people" patch, in other words.

               Linus

 mm/memory.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 550405fc3b5e..bbede6b52f7a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4312,10 +4312,21 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 	}
 
 	if (!vmf->pte) {
-		if (vma_is_anonymous(vmf->vma))
-			return do_anonymous_page(vmf);
-		else
-			return do_fault(vmf);
+		if (!vma_is_anonymous(vmf->vma)) {
+			vm_fault_t ret = do_fault(vmf);
+			if (ret & VM_FAULT_RETRY)
+				return ret;
+			if (!(ret & VM_FAULT_SIGBUS))
+				return ret;
+/* FIXME! We don't have a VM_NOFAULT bit */
+#if 0
+			/* See if we should turn a SIGBUS into an anonymous page */
+			if (!(vma->vm_flags & VM_NOFAULT))
+				return ret;
+#endif
+/* Fall back on do_anonymous_page() instead of SIGBUS */
+		}
+		return do_anonymous_page(vmf);
 	}
 
 	if (!pte_present(vmf->orig_pte))
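
For anyone who wants to try the patch, a user-space reproducer for the
SIGBUS situation might look roughly like the sketch below. It is not part
of the patch, and the helper name signal_on_read_past_eof() and the /tmp
path are made up for illustration. A child maps a two-page file with
MAP_PRIVATE, shrinks the file to one page, and then reads from the second
page; the parent reports which signal (if any) killed the child. On an
unpatched kernel that should be SIGBUS. With the unconditional fallback
above, the expectation (untested) is that the read returns zero and the
helper returns 0.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns the signal that killed a child which read past EOF through a
 * MAP_PRIVATE file mapping, 0 if the read succeeded, -1 on setup failure.
 */
int signal_on_read_past_eof(void)
{
	char path[] = "/tmp/nofault-demo-XXXXXX";	/* assumed scratch location */
	long page = sysconf(_SC_PAGESIZE);
	int status;
	pid_t pid;
	int fd;

	if (page <= 0)
		return -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);			/* file lives on only through fd */

	/* Start with a two-page file. */
	if (ftruncate(fd, 2 * page) < 0) {
		close(fd);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(fd);
		return -1;
	}

	if (pid == 0) {
		volatile char *p;
		void *m;
		char c;

		signal(SIGBUS, SIG_DFL);

		/* Map both pages privately, without touching them yet. */
		m = mmap(NULL, 2 * page, PROT_READ, MAP_PRIVATE, fd, 0);
		if (m == MAP_FAILED)
			_exit(2);
		p = m;

		/* Shrink the file so the second page is now past EOF. */
		if (ftruncate(fd, page) < 0)
			_exit(2);

		/* This read faults on a page with no backing file data. */
		c = p[page];
		(void)c;
		_exit(0);
	}

	close(fd);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (WIFSIGNALED(status))
		return WTERMSIG(status);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling signal_on_read_past_eof() from a small main() and comparing the
result against SIGBUS gives a quick before/after check. Switching the
mapping to MAP_SHARED should keep returning SIGBUS even with the patch
applied, per the side note about shared mappings.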