On 05/02/2024 09:51, Barry Song wrote:
> +Chris, Suren and Chuanhua
> 
> Hi Ryan,
> 
>> +	/*
>> +	 * __scan_swap_map_try_ssd_cluster() may drop si->lock during discard,
>> +	 * so indicate that we are scanning to synchronise with swapoff.
>> +	 */
>> +	si->flags += SWP_SCANNING;
>> +	ret = __scan_swap_map_try_ssd_cluster(si, &offset, &scan_base, order);
>> +	si->flags -= SWP_SCANNING;
> 
> Nobody is using this scan_base afterwards; it seems a bit weird to
> pass a pointer.
> 
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1212,11 +1212,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>  					if (!can_split_folio(folio, NULL))
>>  						goto activate_locked;
>>  					/*
>> -					 * Split folios without a PMD map right
>> -					 * away. Chances are some or all of the
>> -					 * tail pages can be freed without IO.
>> +					 * Split PMD-mappable folios without a
>> +					 * PMD map right away. Chances are some
>> +					 * or all of the tail pages can be freed
>> +					 * without IO.
>>  					 */
>> -					if (!folio_entire_mapcount(folio) &&
>> +					if (folio_test_pmd_mappable(folio) &&
>> +					    !folio_entire_mapcount(folio) &&
>>  					    split_folio_to_list(folio,
>>  								folio_list))
>>  						goto activate_locked;
>> --
> 
> Chuanhua and I ran this patchset for a couple of days and found a race
> between reclamation and split_folio; it can cause applications to read
> wrong data (zero) while swapping in.
> 
> In the failing case, one thread (T1) is reclaiming a large folio by
> some means while another thread (T2) is calling madvise MADV_PAGEOUT,
> and at the same time two threads, T3 and T4, are swapping in in
> parallel. T1 doesn't split and T2 does split, as below,

Hi Barry,

Do you have a test case you can share that provokes this problem?

And is this a separate problem from the race you solved with TTU_SYNC,
or is this solving the same problem?

Thanks,
Ryan
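
For readers tracing the first hunk: the SWP_SCANNING bracket matters
because swapoff waits for in-flight scanners before tearing the device
down. Below is a minimal sketch of that consumer side, simplified from
the swapoff path in mm/swapfile.c; it is an illustration of the
pattern, not the exact kernel code.

	/*
	 * Simplified sketch of the swapoff-side wait (cf. the swapoff
	 * syscall in mm/swapfile.c); not the exact kernel code.
	 * SWP_SCANNING is the highest flag bit and is added rather
	 * than OR'ed in, so each concurrent scanner lifts si->flags
	 * above it.
	 */
	spin_lock(&si->lock);
	si->highest_bit = 0;	/* cuts scans short */
	while (si->flags >= SWP_SCANNING) {
		/*
		 * A scanner may have dropped si->lock during discard;
		 * wait for it to finish and subtract SWP_SCANNING.
		 */
		spin_unlock(&si->lock);
		schedule_timeout_uninterruptible(1);
		spin_lock(&si->lock);
	}
	spin_unlock(&si->lock);

This is also why the hunk uses += and -= rather than |= and &=: several
scanners can be inside the region at once, and swapoff must wait for
all of them to finish.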
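
Barry's interleaving diagram ("as below") did not survive in the quote
above. The comment-style timeline below is a reconstruction of what his
prose describes, offered only as an illustration; it restates the
message's claims and deliberately names no call sites, since those are
not known from this excerpt.

	/*
	 * Reconstructed interleaving (illustration only, not Barry's
	 * original diagram):
	 *
	 *  T1 (reclaim,        T2 (MADV_PAGEOUT,     T3 / T4
	 *   no split)           does split)           (parallel swap-in)
	 *  ------------------  --------------------  -------------------
	 *  reclaims the large
	 *  folio as a whole
	 *                      splits the same folio
	 *                      while T1's reclaim is
	 *                      still in flight
	 *                                            swap parts of the
	 *                                            folio back in, in
	 *                                            parallel; one can
	 *                                            read zero-filled
	 *                                            (wrong) data
	 */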