On Mon, Apr 8, 2024 at 11:52 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Yuanchu Xie <yuanchu@xxxxxxxxxx> writes:
>
> > When non-leaf pmd accessed bits are available, MGLRU page table walks
> > can clear the accessed bit and promptly ignore the accessed bit on the
> > pte because it's on a different node, so the walk does not update the
> > generation of said page. When the next scan comes around on the right
> > node, the non-leaf pmd accessed bit might remain cleared and the pte
> > accessed bits won't be checked. While this is sufficient for
> > reclaim-driven aging, where the goal is to select a reasonably cold
> > page, the access can be missed when aging proactively for measuring the
> > working set size of a node/memcg.
> >
> > Since force_scan disables various other optimizations, we check
> > force_scan to ignore the non-leaf pmd accessed bit.
> >
> > Signed-off-by: Yuanchu Xie <yuanchu@xxxxxxxxxx>
> > ---
> >  mm/vmscan.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 4f9c854ce6cc..1a7c7d537db6 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3522,7 +3522,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
> >
> >  		walk->mm_stats[MM_NONLEAF_TOTAL]++;
> >
> > -		if (should_clear_pmd_young()) {
> > +		if (!walk->force_scan && should_clear_pmd_young()) {
> >  			if (!pmd_young(val))
> >  				continue;
>
> Sorry, I don't understand why we need this.  If !pmd_young(val), we
> don't need to update the generation.  If pmd_young(val), the bloom
> filter will be ignored if force_scan == true.  Or do I miss something?

If !pmd_young(val), we still might need to update the generation.

The get_pfn_folio function returns NULL if the folio's nid != node under
scanning, so the pte accessed bit does not get cleared and the
generation is not updated.
Now the pmd_young flag of this pmd is cleared, and if none of the ptes
are accessed again before another round of scanning occurs on the
folio's node, the pmd_young check fails and the pte accessed bits are
not checked. This is fine for kswapd but can introduce inaccuracies
when scanning proactively for workingset estimation.

Thanks,
Yuanchu
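
To make the sequence concrete, here is a minimal user-space sketch of the
scenario described above. It is a toy model, not kernel code: toy_pte,
toy_pmd, toy_walk_pmd and toy_scenario are hypothetical names made up for
illustration, the nid comparison only stands in for get_pfn_folio
returning NULL, and the bloom filter and locking are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PTES 4

/* Toy stand-ins for kernel state; these are not real kernel types. */
struct toy_pte {
	bool young;	/* pte accessed bit */
	int nid;	/* node of the backing folio */
	int gen;	/* generation of the backing folio */
};

struct toy_pmd {
	bool young;	/* non-leaf pmd accessed bit */
	struct toy_pte pte[NR_PTES];
};

/*
 * One aging pass over a single pmd on behalf of scan_nid.
 * Returns the number of folios whose generation was updated.
 */
static int toy_walk_pmd(struct toy_pmd *pmd, int scan_nid, int new_gen,
			bool force_scan)
{
	int updated = 0;

	/* Models the patched check: force_scan bypasses the pmd shortcut. */
	if (!force_scan) {
		if (!pmd->young)
			return 0;	/* every pte under this pmd is skipped */
		pmd->young = false;	/* the walk clears the non-leaf bit */
	}

	for (int i = 0; i < NR_PTES; i++) {
		struct toy_pte *pte = &pmd->pte[i];

		if (pte->nid != scan_nid)
			continue;	/* stands in for get_pfn_folio() == NULL */
		if (!pte->young)
			continue;
		pte->young = false;
		pte->gen = new_gen;
		updated++;
	}
	return updated;
}

/*
 * A folio on node 1 has been accessed. Node 0 is scanned first, then
 * node 1. Returns how many folios the node 1 scan aged.
 */
static int toy_scenario(bool force_scan)
{
	struct toy_pmd pmd = { .young = true };	/* other fields zeroed */

	pmd.pte[2] = (struct toy_pte){ .young = true, .nid = 1, .gen = 0 };

	/* The node 0 pass finds nothing of its own to age. */
	int first = toy_walk_pmd(&pmd, 0, 1, force_scan);
	assert(first == 0);

	return toy_walk_pmd(&pmd, 1, 1, force_scan);
}
```

In this model, toy_scenario(false) returns 0 because the node 0 pass
clears the pmd bit and the node 1 pass then bails out before looking at
any pte, while toy_scenario(true) returns 1 because the pmd shortcut is
bypassed and the young pte on node 1 is found.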