On 05/02/2024 09:51, Barry Song wrote:
> +Chris, Suren and Chuanhua
>
> Hi Ryan,
>
>> +    /*
>> +     * __scan_swap_map_try_ssd_cluster() may drop si->lock during discard,
>> +     * so indicate that we are scanning to synchronise with swapoff.
>> +     */
>> +    si->flags += SWP_SCANNING;
>> +    ret = __scan_swap_map_try_ssd_cluster(si, &offset, &scan_base, order);
>> +    si->flags -= SWP_SCANNING;
>
> nobody is using this scan_base afterwards. it seems a bit weird to
> pass a pointer.
>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1212,11 +1212,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>                  if (!can_split_folio(folio, NULL))
>>                      goto activate_locked;
>>                  /*
>> -                 * Split folios without a PMD map right
>> -                 * away. Chances are some or all of the
>> -                 * tail pages can be freed without IO.
>> +                 * Split PMD-mappable folios without a
>> +                 * PMD map right away. Chances are some
>> +                 * or all of the tail pages can be freed
>> +                 * without IO.
>>                   */
>> -                if (!folio_entire_mapcount(folio) &&
>> +                if (folio_test_pmd_mappable(folio) &&
>> +                    !folio_entire_mapcount(folio) &&
>>                      split_folio_to_list(folio,
>>                                  folio_list))
>>                      goto activate_locked;
>> --
>
> Chuanhua and I ran this patchset for a couple of days and found a race
> between reclamation and split_folio. this might cause applications get
> wrong data 0 while swapping-in.

I can't claim to fully understand the problem yet (thanks for all the
details - I'll keep reading it and looking at the code until I do), but I
guess this problem should exist today for PMD-mappable folios? We already
skip splitting those folios if they are pmd-mapped. Or does the problem
only apply to pte-mapped folios?
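
To make the question concrete, the split decision in the mm/vmscan.c hunk quoted above can be modelled in user space roughly as below. This is only a sketch under stated assumptions: `struct folio_model` and both helper names are hypothetical, the two fields merely stand in for folio_test_pmd_mappable() and folio_entire_mapcount(), and the other conditions surrounding the real check in shrink_folio_list() are not modelled.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the folio state the hunk inspects.
 * Not a kernel structure; for illustration only.
 */
struct folio_model {
	bool pmd_mappable;	/* models folio_test_pmd_mappable(folio) */
	int entire_mapcount;	/* models folio_entire_mapcount(folio) */
};

/* Old condition: split any large folio that has no PMD mapping. */
static bool split_before_patch(const struct folio_model *f)
{
	return f->entire_mapcount == 0;
}

/* New condition: split only PMD-mappable folios that have no PMD mapping. */
static bool split_after_patch(const struct folio_model *f)
{
	return f->pmd_mappable && f->entire_mapcount == 0;
}
```

In this model a PMD-mapped folio is left unsplit by both predicates, whereas a smaller pte-mapped large folio is split by the old predicate and left whole by the new one.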