On Mon, Jan 11, 2021 at 2:18 PM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jan 11, 2021 at 11:19 AM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> Actually, what I think might be a better model is to actually
> strengthen the rules even more, and get rid of GUP_PIN_COUNTING_BIAS
> entirely.
>
> What we could do is just make a few clear rules explicit (most of
> which we already basically hold to). Starting from that basic
>
> (a) Anonymous pages are made writable (ie COW) only when they have a
> page_count() of 1

Seems reasonable to me.

>
> That very simple rule then automatically results in the corollary
>
> (b) a writable page in a COW mapping always starts out reachable
> _only_ from the page tables

Seems reasonable. I guess that if the COW is triggered by GUP, then it
starts out reachable only from the page tables but then becomes
reachable through GUP very soon thereafter.

>
> and now we could have a couple of really simple new rules:
>
> (c) we never ever make a writable page in a COW mapping read-only
> _unless_ it has a page_count() of 1

I don't love this. Having mprotect() fail in a multithreaded process
because another thread happens to be doing a short-lived IO seems like
it may result in annoying intermittent bugs.

As I understand it, the issue is that the way we determine that we need
to COW a COWable page is that we see that it's read-only. It would be
nice if we could separately track "the VMA allows writes" and "this PTE
points to a page that is private to the owning VMA", but maybe there's
no bit available for the latter other than looking at RO vs RW directly.

>
> (d) we never create a swap cache page out of a writable COW mapping page
>
> Now, if you combine these rules, the whole need for the
> GUP_PIN_COUNTING_BIAS basically goes away.
>
> Why? Because we know that the _only_ thing that can elevate the
> refcount of a writable COW page is GUP - we'll just make sure nothing
> else touches it.
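To check that I'm reading (a) and (c) the way they are meant, here is a
toy userspace model of the rules. It is only a sketch, not kernel code:
struct toy_page, struct toy_pte and every toy_*() helper are made up for
illustration, and the refcount field merely stands in for page_count().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. Nothing here is a real kernel type or function. */
struct toy_page {
	int refcount;		/* stands in for page_count() */
};

struct toy_pte {
	struct toy_page *page;	/* page this (single) mapping points at */
	bool writable;
};

/* A GUP-style reference: take an extra ref on whatever the PTE maps. */
static struct toy_page *toy_gup_get(struct toy_pte *pte)
{
	pte->page->refcount++;
	return pte->page;
}

static void toy_gup_put(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Rule (a): a write fault reuses the page in place only when the
 * mapping holds the sole reference; otherwise it switches the PTE to
 * a fresh private page. Returns true if the old page was reused.
 */
static bool toy_write_fault(struct toy_pte *pte, struct toy_page *fresh)
{
	bool reuse = (pte->page->refcount == 1);

	if (!reuse) {
		pte->page->refcount--;	/* mapping drops the old page */
		fresh->refcount = 1;	/* mapping owns the new page */
		pte->page = fresh;
	}
	pte->writable = true;
	return reuse;
}

/*
 * Rule (c): never make a writable COW page read-only unless the
 * count is 1. Returns false when the request has to be refused.
 */
static bool toy_write_protect(struct toy_pte *pte)
{
	if (pte->writable && pte->page->refcount != 1)
		return false;
	pte->writable = false;
	return true;
}

/*
 * Consequence of (a)-(d) in this model: a writable page with an
 * elevated count can only have been pinned by GUP, so no bias
 * constant is needed to tell the two cases apart.
 */
static bool toy_maybe_pinned(const struct toy_pte *pte)
{
	return pte->writable && pte->page->refcount > 1;
}
```

In this model the mprotect() concern above shows up as
toy_write_protect() returning false for as long as a toy_gup_get()
reference is outstanding, even if that reference is very short-lived.
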
How common is !FOLL_WRITE GUP? We could potentially say that a
short-term !FOLL_WRITE GUP is permitted on an RO COW page and that a
subsequent COW on the page will wait for the GUP to go away. This might
be too big a can of worms for the benefit it would provide, though.

--Andy