On 02.03.22 17:55, Jason Gunthorpe wrote:
> On Thu, Feb 24, 2022 at 01:26:13PM +0100, David Hildenbrand wrote:
>> Whenever GUP currently ends up taking a R/O pin on an anonymous page that
>> might be shared -- mapped R/O and !PageAnonExclusive() -- any write fault
>> on the page table entry will end up replacing the mapped anonymous page
>> due to COW, resulting in the GUP pin no longer being consistent with the
>> page actually mapped into the page table.
>>
>> The possible ways to deal with this situation are:
>> (1) Ignore and pin -- what we do right now.
>> (2) Fail to pin -- which would be rather surprising to callers and
>>     could break user space.
>> (3) Trigger unsharing and pin the now exclusive page -- reliable R/O
>>     pins.
>

Hi Jason,

> How does this mesh with the common FOLL_FORCE|FOLL_WRITE|FOLL_PIN
> pattern used for requesting read access? Can they be converted to
> just FOLL_WRITE|FOLL_PIN after this?

Interesting question. I haven't thought about this in detail yet, so let
me give it a try:

IIRC, the sole purpose of FOLL_FORCE in the context of R/O pins is to
enforce the eventual COW -- meaning we COW (via FOLL_WRITE) even if we
don't have the permissions to write (via FOLL_FORCE), to make sure we
most certainly have an exclusive anonymous page in a MAP_PRIVATE
mapping.

Dropping only the FOLL_FORCE would make the FOLL_WRITE request fail if
the mapping is currently !VM_WRITE (but is VM_MAYWRITE), so that
wouldn't work.

I recall that we don't allow pinning the zero page ("special pte",
!vm_normal_page()). So if you have an ordinary MAP_PRIVATE|MAP_ANON
mapping, you will now only need a "FOLL_READ" and have a reliable pin,
even if not previously writing to every page.

It would be different with MAP_PRIVATE file mappings, as far as I
remember: With FOLL_FORCE|FOLL_WRITE|FOLL_PIN we'd force placement of an
anonymous page, resulting in the R/O (long-term ?) pin not observing
subsequent file changes.
With a pure FOLL_READ we'd still observe file changes as we don't
trigger a write fault. BUT, once we actually write to the private
mapping via the page table, the GUP pin would go out of sync with the
now-anonymous page mapped into the page table.

However, I'm having a hard time answering what's actually expected.
It's really hard to tell what the user wants when pinning in a
MAP_PRIVATE file mapping and stumbling over a !anon page (no
modifications so far):

(a) I want a R/O pin to observe file modifications.
(b) I want the R/O pin to *not* observe file modifications but observe
    my (eventual? if any) private modifications.

Of course, if we already wrote to that page and now have an anon page,
it's easy: we are already no longer following file changes.

Maybe FOLL_PIN would already do now what we'd expect from a R/O pin --
(a), maybe not. I'm wondering if FOLL_LONGTERM could give us an
indication whether (a) or (b) applies.

-- 
Thanks,

David / dhildenb
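
For reference, the MAP_PRIVATE file-mapping behavior behind (a) and (b)
can be reproduced from user space without any GUP involvement. The
following is only a minimal sketch, assuming Linux and a page-cache-backed
filesystem under /tmp; private_mapping_demo() is a hypothetical helper
written for illustration, not a kernel or libc API. It samples the first
byte of a private file mapping once while the page is still the file page
and once after a private write has triggered COW.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (illustration only).
 * Maps a temporary file MAP_PRIVATE and reports the first byte seen
 * through the mapping at two points in time.
 * Returns 0 on success, -1 on any error.
 */
int private_mapping_demo(char *seen_before_cow, char *seen_after_cow)
{
	char path[] = "/tmp/gup-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	char *map;
	int rc = -1;
	int fd;

	if (page <= 0)
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on until fd is closed */

	if (ftruncate(fd, page) != 0 || pwrite(fd, "A", 1, 0) != 1)
		goto out_close;

	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Read fault only: the mapping still refers to the file page. */
	(void)*(volatile char *)map;

	/* Change the file, then look through the not-yet-COWed mapping. */
	if (pwrite(fd, "B", 1, 0) != 1)
		goto out_unmap;
	*seen_before_cow = *(volatile char *)map;

	/* Private write: COW replaces the file page with an anon page. */
	map[0] = 'P';

	/* Further file changes no longer reach the private copy. */
	if (pwrite(fd, "C", 1, 0) != 1)
		goto out_unmap;
	*seen_after_cow = *(volatile char *)map;
	rc = 0;

out_unmap:
	munmap(map, (size_t)page);
out_close:
	close(fd);
	return rc;
}
```

After the private write, seen_after_cow holds 'P' rather than 'C': the
mapping has stopped following the file. For the first sample, POSIX
leaves it unspecified whether the file change is visible; on Linux the
unmodified private page is normally the page-cache page, so the change
is usually observed. A R/O pin taken before the private write would
keep referring to the file page, which is the out-of-sync case
described above.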