On 09.08.22 22:00, Linus Torvalds wrote: > On Mon, Aug 8, 2022 at 12:32 AM David Hildenbrand <david@xxxxxxxxxx> wrote: >> > > So I've read through the patch several times, and it seems fine, but > this function (and the pmd version of it) just read oddly to me. > >> +static inline bool can_follow_write_pte(pte_t pte, struct page *page, >> + struct vm_area_struct *vma, >> + unsigned int flags) >> +{ >> + if (pte_write(pte)) >> + return true; >> + if (!(flags & FOLL_FORCE)) >> + return false; >> + >> + /* >> + * See check_vma_flags(): only COW mappings need that special >> + * "force" handling when they lack VM_WRITE. >> + */ >> + if (vma->vm_flags & VM_WRITE) >> + return false; >> + VM_BUG_ON(!is_cow_mapping(vma->vm_flags)); > > So apart from the VM_BUG_ON(), this code just looks really strange - > even despite the comment. Just conceptually, the whole "if it's > writable, return that you cannot follow it for a write" just looks so > very very strange. > > That doesn't make the code _wrong_, but considering how many times > this has had subtle bugs, let's not write code that looks strange. > > So I would suggest that to protect against future bugs, we try to make > it be fairly clear and straightforward, and maybe even a bit overly > protective. > > For example, let's kill the "shared mapping that you don't have write > permissions to" very explicitly and without any subtle code at all. > The vm_flags tests are cheap and easy, and we could very easily just > add some core ones to make any mistakes much less critical. > > Now, making that 'is_cow_mapping()' check explicit at the very top of > this would already go a long way: > > /* FOLL_FORCE for writability only affects COW mappings */ > if (!is_cow_mapping(vma->vm_flags)) > return false; I actually put the is_cow_mapping() mapping check in there because check_vma_flags() should make sure that we cannot possibly end up here in that case. But we can spell it out with comments, doesn't hurt. > > but I'd actually go even further: in this case that "is_cow_mapping()" > helper to some degree actually hides what is going on. > > So I'd actually prefer for that function to be written something like > > /* If the pte is writable, we can write to the page */ > if (pte_write(pte)) > return true; > > /* Maybe FOLL_FORCE is set to override it? */ > if (flags & FOLL_FORCE) > return false; > > /* But FOLL_FORCE has no effect on shared mappings */ > if (vma->vm_flags & MAP_SHARED) > return false; > > /* .. or read-only private ones */ > if (!(vma->vm_flags & MAP_MAYWRITE)) > return false; > > /* .. or already writable ones that just need to take a write fault */ > if (vma->vm_flags & MAP_WRITE) > return false; > > and the two first vm_flags tests above are basically doing tat > "is_cow_mapping()", and maybe we could even have a comment to that > effect, but wouldn't it be nice to just write it out that way? > > And after you've written it out like the above, now that > > if (!page || !PageAnon(page) || !PageAnonExclusive(page)) > return false; > > makes you pretty safe from a data sharing perspective: it's most > definitely not a shared page at that point. > > So if you write it that way, the only remaining issues are the magic > special soft-dirty and uffd ones, but at that point it's purely about > the semantics of those features, no longer about any possible "oh, we > fooled some shared page to be writable". > > And I think the above is fairly legible without any subtle cases, and > the one-liner comments make it all fairly clear that it's testing. > > Is any of this in any _technical_ way different from what your patch > did? No. It's literally just rewriting it to be a bit more explicit in > what it is doing, I think, and it makes that odd "it's not writable if > VM_WRITE is set" case a bit more explicit. > > Hmm? No strong opinion. I'm happy as long as it's fixed, and the fix is robust. -- Thanks, David / dhildenb