On Nov 7, 2022, at 11:27 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 07.11.22 20:03, Nadav Amit wrote:
>> On Nov 7, 2022, at 8:17 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>>> Let's catch abuse of FAULT_FLAG_WRITE early, such that we don't have to
>>> care in all other handlers and might get "surprises" if we forget to do
>>> so.
>>>
>>> Write faults without VM_MAYWRITE don't make any sense, and our
>>> maybe_mkwrite() logic could have hidden such abuse for now.
>>>
>>> Write faults without VM_WRITE on something that is not a COW mapping is
>>> similarly broken, and e.g., do_wp_page() could end up placing an
>>> anonymous page into a shared mapping, which would be bad.
>>>
>>> This is a preparation for reliable R/O long-term pinning of pages in
>>> private mappings, whereby we want to make sure that we will never break
>>> COW in a read-only private mapping.
>>>
>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>> ---
>>>  mm/memory.c | 8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index fe131273217a..826353da7b23 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -5159,6 +5159,14 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma,
>>>                  */
>>>                 if (!is_cow_mapping(vma->vm_flags))
>>>                         *flags &= ~FAULT_FLAG_UNSHARE;
>>> +       } else if (*flags & FAULT_FLAG_WRITE) {
>>> +               /* Write faults on read-only mappings are impossible ... */
>>> +               if (WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE)))
>>> +                       return VM_FAULT_SIGSEGV;
>>> +               /* ... and FOLL_FORCE only applies to COW mappings. */
>>> +               if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE) &&
>>> +                                !is_cow_mapping(vma->vm_flags)))
>>> +                       return VM_FAULT_SIGSEGV;
>>
>> Not sure about the WARN_*(). Seems as if it might trigger in benign even if
>> rare scenarios, e.g., mprotect() racing with page-fault.
>
> We most certainly would want to catch any such broken/racy cases. There
> are no benign cases I could possibly think of.
>
> Page faults need the mmap lock in read. mprotect() / VMA changes need
> the mmap lock in write. Whoever calls handle_mm_fault() is supposed to
> properly check VMA permissions.

My bad. I now see it. Thanks for explaining.
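
For reference, the two checks in the quoted hunk can be modelled in user space roughly as below. This is only a sketch under stated assumptions: the flag bit values, the fault_result enum, and the check_write_fault() helper are made up for illustration and are not the kernel's definitions. is_cow_mapping() is written the way I understand the kernel helper to work (private mapping that may become writable).

```c
#include <stdbool.h>

/* Illustrative flag bits for this sketch; NOT taken from kernel headers. */
#define VM_WRITE     0x1u
#define VM_SHARED    0x2u
#define VM_MAYWRITE  0x4u

/* Hypothetical result type standing in for vm_fault_t. */
enum fault_result { FAULT_OK = 0, FAULT_SIGSEGV = 1 };

/* A COW mapping is private (no VM_SHARED) and may be made writable. */
static bool is_cow_mapping(unsigned int vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical helper mirroring the FAULT_FLAG_WRITE branch of the patch:
 * reject write faults that can never be legitimate for this VMA.
 */
static enum fault_result check_write_fault(unsigned int vm_flags)
{
	/* Write faults on mappings that can never be writable are impossible. */
	if (!(vm_flags & VM_MAYWRITE))
		return FAULT_SIGSEGV;
	/* A forced write to a currently read-only VMA is only valid for COW. */
	if (!(vm_flags & VM_WRITE) && !is_cow_mapping(vm_flags))
		return FAULT_SIGSEGV;
	return FAULT_OK;
}
```

With this model, a read-only private mapping (VM_MAYWRITE only) still passes, which is the FOLL_FORCE case, while a shared mapping that has been mprotect()ed read-only (VM_SHARED | VM_MAYWRITE) is rejected.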