On Thu, Jun 13, 2024 at 10:37 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 13.06.24 11:58, Barry Song wrote:
> > On Thu, Jun 13, 2024 at 9:23 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 13.06.24 02:07, Barry Song wrote:
> >>> From: Barry Song <v-songbaohua@xxxxxxxx>
> >>>
> >>> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap()
> >>> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will
> >>> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating
> >>> the process of bringing up mTHP swapin.
> >>>
> >>>  static __always_inline void __folio_add_anon_rmap(struct folio *folio,
> >>>                  struct page *page, int nr_pages, struct vm_area_struct *vma,
> >>>                  unsigned long address, rmap_t flags, enum rmap_level level)
> >>>  {
> >>>          ...
> >>>          if (unlikely(!folio_test_anon(folio))) {
> >>>                  VM_WARN_ON_FOLIO(folio_test_large(folio) &&
> >>>                                   level != RMAP_LEVEL_PMD, folio);
> >>>          }
> >>>          ...
> >>>  }
> >>>
> >>> It also enhances the code's readability.
> >>>
> >>> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> >>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> >>> ---
> >>>  mm/memory.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/mm/memory.c b/mm/memory.c
> >>> index 2f94921091fb..9c962f62f928 100644
> >>> --- a/mm/memory.c
> >>> +++ b/mm/memory.c
> >>> @@ -4339,6 +4339,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >>>          if (unlikely(folio != swapcache && swapcache)) {
> >>>                  folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
> >>>                  folio_add_lru_vma(folio, vma);
> >>> +        } else if (!folio_test_anon(folio)) {
> >>> +                folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
> >>
> >> So, with large folio swapin, we would never end up here if any swp PTE
> >> is !exclusive, because we would make sure to allocate a large folio only
> >> for suitable regions, correct?
> >>
> >> It can certainly happen during swapout + refault with large folios. But
> >> there, we will always have folio_test_anon() still set and wouldn't run
> >> into this code path.
> >>
> >> (it will all be better with per-folio PAE bit, but that will take some
> >> more time until I actually get to implement it properly, handling all
> >> the nasty corner cases)
> >>
> >> Better add a comment regarding why we are sure that the complete thing
> >> is exclusive (e.g., currently only small folios).
> >
> > No, patch 1/3 enables both cases to call folio_add_new_anon_rmap(). Currently,
> > small folios could be non-exclusive. I suppose your question is whether we
> > should ensure that all pages within a mTHP have the same exclusivity status,
> > rather than always being exclusive?
>
> We must never end up calling folio_add_new_anon_rmap(folio, vma,
> address, RMAP_EXCLUSIVE) if part of the folio is non-exclusive.

Agreed.

> I think we *may* call folio_add_new_anon_rmap(folio, vma, address, 0) if
> part of the folio is exclusive [it got swapped out, so there cannot be
> GUP references]. But, to be future proof (single PAE bit per folio), we
> better avoid that on swapin if possible.
>
> For small folios, it's clear that it cannot be partially exclusive
> (single page).

Yes, for small folios it is much simpler; they are either entirely exclusive
or entirely non-exclusive. For the initial swapin patch [1], which only
supports SYNC-IO mTHP swapin, the mTHP is always entirely exclusive. I am
also considering non-SYNCHRONOUS_IO swapcache-based mTHP swapin, but I intend
to start with the entirely exclusive cases.

[1] https://lore.kernel.org/linux-mm/20240304081348.197341-6-21cnbao@xxxxxxxxx/

> We better comment why we obey the above. Like
>
> "We currently ensure that new folios cannot be partially exclusive: they
> are either fully exclusive or fully shared."
>
> --
> Cheers,
>
> David / dhildenb

Thanks
Barry
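
P.S. For reference, a rough sketch of how the do_swap_page() hunk could read
once such a comment is folded in. This only stitches David's suggested wording
onto the hunk quoted above as a block comment; the exact placement and phrasing
are still open for the next respin:

        if (unlikely(folio != swapcache && swapcache)) {
                folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
                folio_add_lru_vma(folio, vma);
        } else if (!folio_test_anon(folio)) {
                /*
                 * (wording from David's suggestion; exact phrasing TBD)
                 * We currently ensure that new folios cannot be partially
                 * exclusive: they are either fully exclusive or fully shared.
                 */
                folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
        }

That should make it clear to readers of the !folio_test_anon() path why it is
safe to pass rmap_flags (potentially RMAP_EXCLUSIVE) for the entire folio.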