hi, Ryan Roberts 在 2024/3/12 20:34, Ryan Roberts 写道: > On 04/03/2024 08:13, Barry Song wrote: >> From: Chuanhua Han <hanchuanhua@xxxxxxxx> >> >> should_try_to_free_swap() works with an assumption that swap-in is always done >> at normal page granularity, aka, folio_nr_pages = 1. To support large folio >> swap-in, this patch removes the assumption. >> >> Signed-off-by: Chuanhua Han <hanchuanhua@xxxxxxxx> >> Co-developed-by: Barry Song <v-songbaohua@xxxxxxxx> >> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx> >> Acked-by: Chris Li <chrisl@xxxxxxxxxx> >> --- >> mm/memory.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/memory.c b/mm/memory.c >> index abd4f33d62c9..e0d34d705e07 100644 >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -3837,7 +3837,7 @@ static inline bool should_try_to_free_swap(struct folio *folio, >> * reference only in case it's likely that we'll be the exlusive user. >> */ >> return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) && >> - folio_ref_count(folio) == 2; >> + folio_ref_count(folio) == (1 + folio_nr_pages(folio)); > I don't think this is correct; one reference has just been added to the folio in > do_swap_page(), either by getting from swapcache (swap_cache_get_folio()) or by > allocating. If it came from the swapcache, it could be a large folio, because we > swapped out a large folio and never removed it from swapcache. But in that case, > others may have partially mapped it, so the refcount could legitimately equal > the number of pages while still not being exclusively mapped. > > I'm guessing this logic is trying to estimate when we are likely exclusive so > that we remove from swapcache (release ref) and can then reuse rather than CoW > the folio? The main CoW path currently CoWs page-by-page even for large folios, > and with Barry's recent patch, even the last page gets copied. So not sure what > this change is really trying to achieve? > First, if it is a large folio in the swap cache, then its refcont is at least folio_nr_pages(folio) : For example, in add_to_swap_cache path: int add_to_swap_cache(struct folio *folio, swp_entry_t entry, gfp_t gfp, void **shadowp) { struct address_space *address_space = swap_address_space(entry); pgoff_t idx = swp_offset(entry); XA_STATE_ORDER(xas, &address_space->i_pages, idx, folio_order(folio)); unsigned long i, nr = folio_nr_pages(folio); <--- void *old; ... folio_ref_add(folio, nr); <--- folio_set_swapcache(folio); ... } * Then in the do_swap_page path: * if (should_try_to_free_swap(folio, vma, vmf->flags)) folio_free_swap(folio); * * It also indicates that only folio in the swap cache will call folio_free_swap * to delete it from the swap cache, So I feel like this patch is necessary!? 😁 >> } >> >> static vm_fault_t pte_marker_clear(struct vm_fault *vmf) Thanks, Chuanhua