On 5/29/2021 1:15 PM, Hugh Dickins wrote:
NOFAULT? Does BSD use "fault" differently, and in Linux terms we would say NOSIGBUS to mean the same? Can someone point to a specification of BSD's __MAP_NOFAULT? Searching just found me references to bugs.
I checked FreeBSD and OpenBSD; their MAP_NOFAULT seems quite different from NOSIGBUS.
FreeBSD: https://github.com/freebsd/freebsd-src
    MAP_NOFAULT: The mapping should not generate page faults
OpenBSD: https://github.com/openbsd/src
    __MAP_NOFAULT only makes sense with a backing object
What mainly worries me about the suggestion is: what happens to the zero page inserted into NOFAULT mappings, when later a page for that offset is created and added to page cache? Treating it as an opaque blob of zeroes, that stays there ever after, hiding the subsequent data: easy to implement, but a hack that we would probably regret. (And I notice that even the quote from David Herrmann in the original post allows for the possibility that client may want to expand the object.)
Yes, that's the problem ...
I believe the correct behaviour would be to unmap the nofault page then, allowing the proper page to be faulted in after. That is certainly doable (the old mm/filemap_xip.c used to do so), but might get into some awkward race territory, with filesystem dependence (reminiscent of hole punch, in reverse). shmem could operate that way, and be the better for it: but I wouldn't want to add that, without also cleaning away all the shmem_recalc_inode() stuff.
Once the zero page has been mapped there, later reads no longer take a page fault, so nothing triggers a lookup of the new page. At what point should the nofault page be unmapped? I'm reading filemap_xip.c to learn how it was done: https://elixir.bootlin.com/linux/v3.19.8/source/mm/filemap_xip.c
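To make the concern concrete, here is a minimal userspace sketch of the two behaviours discussed above. It is not kernel code and not a proposed implementation: the anonymous MAP_FIXED overlay is only my stand-in for the zero page that a NOSIGBUS-style mapping would insert beyond EOF, and the function name demo_stale_zero_page() and the temp file path are made up for illustration. Without the overlay, touching the second page would raise SIGBUS.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Userspace stand-in only (assumption): the anonymous overlay plays the role
 * of the zero page inserted beyond EOF by a hypothetical NOSIGBUS mapping.
 * Returns 0 if the stale zeroes hide the new data until the overlay is
 * replaced, -1 on setup failure, or the number of the step that did not
 * behave as expected.
 */
static int demo_stale_zero_page(void)
{
    char path[] = "/tmp/nosigbus-demo-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *base;
    void *p;
    int fd, result = -1;

    assert(page > 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    /* The file is one page long, the mapping covers two pages. */
    if (ftruncate(fd, page) != 0)
        goto out_close;
    p = mmap(NULL, 2 * page, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out_close;
    base = p;

    /* Step 1: cover the page beyond EOF with anonymous zeroes. */
    if (mmap(base + page, page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        goto out_unmap;
    if (*(volatile unsigned char *)(base + page) != 0) {
        result = 1;
        goto out_unmap;
    }

    /* Step 2: the object grows and gets data, but the overlay hides it. */
    if (ftruncate(fd, 2 * page) != 0 || pwrite(fd, "X", 1, page) != 1)
        goto out_unmap;
    if (*(volatile unsigned char *)(base + page) != 0) {
        result = 2;
        goto out_unmap;
    }

    /* Step 3: drop the overlay by mapping the file page in its place. */
    if (mmap(base + page, page, PROT_READ,
             MAP_SHARED | MAP_FIXED, fd, page) == MAP_FAILED)
        goto out_unmap;
    result = (*(volatile unsigned char *)(base + page) == 'X') ? 0 : 3;

out_unmap:
    munmap(base, 2 * page);
out_close:
    close(fd);
    return result;
}
```

Called from a small main(), this should return 0 on Linux: step 2 is the "opaque blob of zeroes" case, and step 3 is the userspace analogue of unmapping the nofault page so that the proper page can be faulted in. In the kernel, the hard part is deciding when step 3 happens, and that is exactly the timing question above.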