On 2/7/2023 1:35 AM, David Hildenbrand wrote:
> On 06.02.23 18:10, Matthew Wilcox wrote:
>> On Mon, Feb 06, 2023 at 05:49:20PM +0100, David Hildenbrand wrote:
>>> We have
>>>
>>> +	if (!cow) {
>>> +		folio_add_file_rmap_range(folio, start, nr, vma, false);
>>> +		add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);
>>> +	} else {
>>> +		/*
>>> +		 * rmap code is not ready to handle COW with anonymous
>>> +		 * large folio yet. Capture and warn if large folio
>>> +		 * is given.
>>> +		 */
>>> +		VM_WARN_ON_FOLIO(folio_test_large(folio), folio);
>>> +	}
>>>
>>> now.
>>>
>>> What are we supposed to add on the else branch instead that would
>>> be correct in the future? Or not look weird?
>>
>> Right now, I think this patch should look something like this.
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 7a04a1130ec1..2f6173f83d8b 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4257,15 +4257,18 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>>  }
>>  #endif
>>
>> -void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
>> +void set_pte_range(struct vm_fault *vmf, struct folio *folio,
>> +		struct page *page, unsigned int nr, unsigned long addr)
>>  {
>>  	struct vm_area_struct *vma = vmf->vma;
>>  	bool uffd_wp = pte_marker_uffd_wp(vmf->orig_pte);
>>  	bool write = vmf->flags & FAULT_FLAG_WRITE;
>>  	bool prefault = vmf->address != addr;
>>  	pte_t entry;
>> +	unsigned int i;
>>
>> -	flush_icache_page(vma, page);
>> +	for (i = 0; i < nr; i++)
>> +		flush_icache_page(vma, page + i);
>>  	entry = mk_pte(page, vma->vm_page_prot);
>>
>>  	if (prefault && arch_wants_old_prefaulted_pte())
>> @@ -4279,14 +4282,15 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
>>  		entry = pte_mkuffd_wp(entry);
>>  	/* copy-on-write page */
>>  	if (write && !(vma->vm_flags & VM_SHARED)) {
>> -		inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
>> -		page_add_new_anon_rmap(page, vma, addr);
>> -		lru_cache_add_inactive_or_unevictable(page, vma);
>> +		add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr);
>> +		VM_BUG_ON_FOLIO(nr != 1, folio);
>
> ^ what I asked for (a WARN would be sufficient IMHO). I don't care
> precisely how we tell the educated reader that this function only
> handles this special case (I could even be convinced that a comment
> is good enough ;) ).

David,

Thanks. I am going to move the cow and !cow cases into set_pte_range()
and WARN if a large folio is passed in.

>
>> +		folio_add_new_anon_rmap(folio, vma, addr);
>> +		folio_add_lru_vma(folio, vma);
>>  	} else {
>> -		inc_mm_counter(vma->vm_mm, mm_counter_file(page));
>> -		page_add_file_rmap(page, vma, false);
>> +		add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);
>> +		folio_add_file_rmap_range(folio, page, nr, vma, false);
>>  	}
>> -	set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
>> +	set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr);

Matthew,

I think we can't do this set_ptes() until pte_next() is ready. I will
keep it as a loop that generates the correct entry for each page, and
we can replace the loop with set_ptes() once pte_next() is ready. Let
me know if I got something wrong. Thanks.
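Roughly what I have in mind, as an untested sketch that folds in both
points above (the prefault / arch_wants_old_prefaulted_pte() handling
from do_set_pte() is elided here for brevity):

void set_pte_range(struct vm_fault *vmf, struct folio *folio,
		struct page *page, unsigned int nr, unsigned long addr)
{
	struct vm_area_struct *vma = vmf->vma;
	bool uffd_wp = pte_marker_uffd_wp(vmf->orig_pte);
	bool write = vmf->flags & FAULT_FLAG_WRITE;
	unsigned int i;

	for (i = 0; i < nr; i++)
		flush_icache_page(vma, page + i);

	if (write && !(vma->vm_flags & VM_SHARED)) {
		/*
		 * rmap code is not ready to handle COW with anonymous
		 * large folio yet. Capture and warn if large folio
		 * is given.
		 */
		VM_WARN_ON_FOLIO(folio_test_large(folio), folio);
		add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr);
		folio_add_new_anon_rmap(folio, vma, addr);
		folio_add_lru_vma(folio, vma);
	} else {
		add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);
		folio_add_file_rmap_range(folio, page, nr, vma, false);
	}

	/*
	 * Generate each entry from its own page until pte_next() is
	 * available; then this loop collapses into one set_ptes() call.
	 */
	for (i = 0; i < nr; i++) {
		pte_t entry = mk_pte(page + i, vma->vm_page_prot);

		entry = pte_sw_mkyoung(entry);
		if (write)
			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
		if (unlikely(uffd_wp))
			entry = pte_mkuffd_wp(entry);
		set_pte_at(vma->vm_mm, addr + i * PAGE_SIZE,
			   vmf->pte + i, entry);
	}
}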
Regards,
Yin, Fengwei

>>  }
>>
>>  static bool vmf_pte_changed(struct vm_fault *vmf)
>> @@ -4359,7 +4363,9 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
>>
>>  	/* Re-check under ptl */
>>  	if (likely(!vmf_pte_changed(vmf))) {
>> -		do_set_pte(vmf, page, vmf->address);
>> +		struct folio *folio = page_folio(page);
>> +
>> +		set_pte_range(vmf, folio, page, 1, vmf->address);
>>
>>  		/* no need to invalidate: a not-present page won't be cached */
>>  		update_mmu_cache(vma, vmf->address, vmf->pte);
>>
>>> Go on, scream louder at me, I don't care.
>>
>> I'm not even shouting. I just think you're wrong ;-)
>>
>
> Good ;)