On Tue, Apr 27, 2021 at 09:51:58AM -0700, Linus Torvalds wrote:
> On Tue, Apr 27, 2021 at 1:25 AM Simon Ser <contact@xxxxxxxxxxx> wrote:
> >
> > Rather than requiring changes in all compositors *and* clients, can we
> > maybe only require changes in compositors? For instance, OpenBSD has a
> > __MAP_NOFAULT flag. When passed to mmap, it means that out-of-bound
> > accesses will read as zeroes instead of triggering SIGBUS. Such a flag
> > would be very helpful to unblock the annoying SIGBUS situation.
> >
> > Would something along these lines be welcome in the Linux kernel?
>
> Hmm. It doesn't look too hard to do. The biggest problem is actually
> that we've run out of flags in the vma (on 32-bit architectures), but
> you could try this UNTESTED patch that just does the MAP_NOFAULT thing
> unconditionally.
>
> NOTE! Not only is it untested, not only is this a "for your testing
> only" (because it does it unconditionally rather than only for
> __MAP_NOFAULT), but it might be bogus for other reasons. In
> particular, this patch depends on "vmf->address" not being changed by
> the ->fault() infrastructure, so that we can just re-use the vmf for
> the anonymous case if we get a SIGBUS.
>
> I think that's all ok these days, because Kirill and Peter Xu cleaned
> up those paths, but I didn't actually check. So I'm cc'ing Kirill,
> Peter and Will, who have been working in this area for other reasons
> fairly recently.
>
> Side note: this will only ever work for non-shared mappings.

I think it's a show-stopper for the use-case, no? IIUC, the mapping is
used for communication between a compositor and a client and has to be
shared.

> That's fundamental. We won't add an anonymous page to a shared mapping,
> and do_anonymous_page() does verify that. So a MAP_SHARED mapping will
> still return SIGBUS even with this patch (although it's not obvious from
> the patch - the VM_FAULT_SIGBUS will just be re-created by
> do_anonymous_page()).
>
> So if you want a _shared_ mapping to honor __MAP_NOFAULT and insert
> random anonymous pages into it, I think the answer is "no, that's not
> going to be viable".

+ Matthew, Dan.

DAX uses zero pages in page cache to avoid allocating backing storage
for read accesses to holes. Maybe we can generalize it beyond DAX to any
page cache and add a (per-inode?) flag to do the same for accesses
beyond i_size?

> So _if_ this works for you, and if it's ok that only MAP_PRIVATE can
> have __MAP_NOFAULT, and if Kirill/Peter/Will don't say "Oh, Linus,
> you're completely off your rocker and clearly need to be taking your
> meds", something like this - if we figure out the conditional bit -
> might be doable.
>
> That's a fair number of "ifs".
>
> Ok, back to the merge window for me, I'll be throwing away this crazy
> untested patch immediately after hitting "send". This is very much a
> "throw the idea over to other people" patch, in other words.
>
>                   Linus

-- 
 Kirill A. Shutemov
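
For anyone following along, the SIGBUS behaviour under discussion can be
reproduced from userspace. Below is a minimal sketch, not taken from the
patch in this thread: read_past_eof_faults() and the /tmp path template
are hypothetical names chosen for illustration, and it assumes Linux with
a writable /tmp and a kernel with no MAP_NOFAULT-style behaviour. It maps
two pages of a one-page file and reads from the second page.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jmp;

/* Jump back to the probe instead of letting SIGBUS kill the process. */
static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_jmp, 1);
}

/*
 * Hypothetical helper: returns 1 if a read past EOF raised SIGBUS,
 * 0 if the read succeeded, -1 if the setup failed.
 * map_flags must be MAP_SHARED or MAP_PRIVATE.
 */
int read_past_eof_faults(int map_flags)
{
	char path[] = "/tmp/nofault-demo-XXXXXX";	/* illustrative path */
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile int result = -1;
	volatile char *p;
	int fd;

	assert(map_flags == MAP_SHARED || map_flags == MAP_PRIVATE);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* The file is one page long, but two pages get mapped below. */
	if (ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, 2 * page, PROT_READ, map_flags, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, &old);

	if (sigsetjmp(sigbus_jmp, 1) == 0) {
		(void)p[page];	/* first byte of the page beyond EOF */
		result = 0;
	} else {
		result = 1;
	}

	sigaction(SIGBUS, &old, NULL);
	munmap((void *)p, 2 * page);
	close(fd);
	return result;
}
```

On a stock kernel both read_past_eof_faults(MAP_SHARED) and
read_past_eof_faults(MAP_PRIVATE) should return 1. Going by the
description quoted above, the unconditional test patch would only be
expected to change the MAP_PRIVATE result; I have not verified that.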