On Thu, Jun 20, 2024 at 8:49 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 20.06.24 10:33, Barry Song wrote:
> > On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 18.06.24 01:11, Barry Song wrote:
> >>> From: Barry Song <v-songbaohua@xxxxxxxx>
> >>>
> >>> For the !folio_test_anon(folio) case, we can now invoke
> >>> folio_add_new_anon_rmap() with the rmap flags set to either EXCLUSIVE or
> >>> non-EXCLUSIVE. This suppresses the VM_WARN_ON_FOLIO check within
> >>> __folio_add_anon_rmap() as we start bringing up mTHP swapin.
> >>>
> >>>  static __always_inline void __folio_add_anon_rmap(struct folio *folio,
> >>>                  struct page *page, int nr_pages, struct vm_area_struct *vma,
> >>>                  unsigned long address, rmap_t flags, enum rmap_level level)
> >>>  {
> >>>          ...
> >>>          if (unlikely(!folio_test_anon(folio))) {
> >>>                  VM_WARN_ON_FOLIO(folio_test_large(folio) &&
> >>>                                   level != RMAP_LEVEL_PMD, folio);
> >>>          }
> >>>          ...
> >>>  }
> >>>
> >>> It also improves the code's readability. Currently, all new anonymous
> >>> folios calling folio_add_anon_rmap_ptes() are order-0. This ensures
> >>> that new folios cannot be partially exclusive; they are either entirely
> >>> exclusive or entirely shared.
> >>>
> >>> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> >>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> >>> Tested-by: Shuai Yuan <yuanshuai@xxxxxxxx>
> >>> ---
> >>>  mm/memory.c   |  8 ++++++++
> >>>  mm/swapfile.c | 13 +++++++++++--
> >>>  2 files changed, 19 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/mm/memory.c b/mm/memory.c
> >>> index 1f24ecdafe05..620654c13b2f 100644
> >>> --- a/mm/memory.c
> >>> +++ b/mm/memory.c
> >>> @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >>>          if (unlikely(folio != swapcache && swapcache)) {
> >>>                  folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
> >>>                  folio_add_lru_vma(folio, vma);
> >>> +        } else if (!folio_test_anon(folio)) {
> >>> +                /*
> >>> +                 * We currently only expect small !anon folios, for which we know
> >>> +                 * that they are either fully exclusive or fully shared. If we
> >>> +                 * ever get large folios here, we have to be careful.
> >>> +                 */
> >>> +                VM_WARN_ON_ONCE(folio_test_large(folio));
> >>> +                folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
> >>>          } else {
> >>>                  folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address,
> >>>                                           rmap_flags);
> >>> diff --git a/mm/swapfile.c b/mm/swapfile.c
> >>> index ae1d2700f6a3..69efa1a57087 100644
> >>> --- a/mm/swapfile.c
> >>> +++ b/mm/swapfile.c
> >>> @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
> >>>                  VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
> >>>                  if (pte_swp_exclusive(old_pte))
> >>>                          rmap_flags |= RMAP_EXCLUSIVE;
> >>> -
> >>> -                folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
> >>> +                /*
> >>> +                 * We currently only expect small !anon folios, for which we know
> >>> +                 * that they are either fully exclusive or fully shared. If we ever
> >>> +                 * get large folios here, we have to be careful.
> >>> +                 */
> >>> +                if (!folio_test_anon(folio)) {
> >>> +                        VM_WARN_ON_ONCE(folio_test_large(folio));
> >>
> >> (comment applies to both cases)
> >>
> >> Thinking about Hugh's comment, we should likely add here:
> >>
> >> VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
> >>
> >> [the check we are removing from __folio_add_anon_rmap()]
> >>
> >> and document for folio_add_new_anon_rmap() in patch #1, that when
> >> dealing with folios that might be mapped concurrently by others, the
> >> folio lock must be held.
> >
> > I assume you mean something like the following for patch #1?
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index df1a43295c85..20986b25f1b2 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1394,7 +1394,8 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page,
> >   *
> >   * Like folio_add_anon_rmap_*() but must only be called on *new* folios.
> >   * This means the inc-and-test can be bypassed.
> > - * The folio does not have to be locked.
> > + * The folio doesn't necessarily need to be locked while it's exclusive unless two
> > + * threads map it concurrently. However, the folio must be locked if it's shared.
> >   *
> >   * If the folio is pmd-mappable, it is accounted as a THP.
> >   */
> > @@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
> >          int nr_pmdmapped = 0;
> >
> >          VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> > +        VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio);
>
> For now this would likely do. I was concerned about a concurrent
> scenario in the exclusive case, but that shouldn't really happen I guess.
>

Since this is primarily a documentation update, I'll wait two or three days
to see if there are any more linux-next reports before sending v3, which
combines these fixes (I've already fixed another doc warning reported by
lkp), and ask Andrew to drop v2 and apply v3.

> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
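A minimal sketch, not part of the original messages, of how the top of
folio_add_new_anon_rmap() might read once the assertion and kerneldoc change
discussed above land. The derivation of "exclusive" from the flags argument
and the surrounding locals are assumptions for illustration; only the hugetlb
warning, the new lock warning, and the locking rule itself come from the diffs
in the thread.

void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
                unsigned long address, rmap_t flags)
{
        /* assumed derivation of the exclusive flag from the rmap flags */
        const bool exclusive = flags & RMAP_EXCLUSIVE;
        int nr_pmdmapped = 0;

        VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
        /*
         * A shared (!exclusive) new anon folio may be mapped concurrently by
         * others, so the caller must hold the folio lock; an exclusive one
         * does not strictly need it.
         */
        VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio);
        ...
}

Placing the check inside the helper, rather than at each call site as first
suggested, covers both the do_swap_page() and unuse_pte() paths with a single
assertion.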