On Thu, Dec 02, 2021 at 10:54:48AM -0800, Linus Torvalds wrote: > On Wed, Dec 1, 2021 at 8:11 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > > > The other patch we've been kicking around (and works) is: > > > > static inline bool should_force_cow_break(struct vm_area_struct *vma, unsigned > > int flags) > > { > > - return is_cow_mapping(vma->vm_flags) && (flags & FOLL_GET); > > + return is_cow_mapping(vma->vm_flags) && > > + (!(vma->vm_flags & VM_DENYWRITE)) && (flags & FOLL_GET); > > } > > That patch makes no sense to me. > > It may "work", but it doesn't actually do anything sensible or really > fix the problem that I can tell. Oh absolutely, it's semantically nonsense. The only reason it fixes the problem is that VM_DENYWRITE VMAs are the only ones considered for the RO_THP merging, so they're the only ones which we've seen causing a problem. > I suspect a real fix would be bigger and more invasive. Darn. I was hoping you were going to say something like "The real problem is follow_trans_huge_pmd() is complete garbage and it should just do X, Y and Z". Or "When we force on FOLL_WRITE, we should also force on FOLL_SPLIT_PMD". > If the answer is not to backport all the other changes (and they were > _really_ invasive), I think one answer may be to simply move the > "should_force_cow_break()" down to below the point where you've looked > up the page. > > Then you can actually look at "is this a file mapped page", and say > "if so, that's ok, we can return it as-is". > > Otherwise, you do something like > > foll_flags |= FOLL_WRITE; > free_page(page); > goto repeat; > > to repeat the loop (now with FOLL_WRITE). > > So the patch is bigger and more involved, because you would have done > the page lookup (for reading) and now notice "Oh, I need it for > writing instead" so you need to undo and re-do). > > But at least - unlike backporting everything else - it would be > limited to that one __get_user_pages() function. > > Hmm? > > (And you'd need to handle that follow_hugetlb_page() case too), not > just the follow_page_mask() one) Thanks, I'll take a look.