On Mon, Jun 3, 2024 at 8:06 PM Yuanchu Xie <yuanchu@xxxxxxxxxx> wrote:
>
> When non-leaf pmd accessed bits are available, MGLRU page table walks
> can clear the non-leaf pmd accessed bit and ignore the accessed bit on
> the pte if it's on a different node, skipping a generation update as
> well. If another scan occurrs on the same node as said skipped pte.
> the non-leaf pmd accessed bit might remain cleared and the pte accessed
> bits won't be checked. While this is sufficient for reclaim-driven
> aging, where the goal is to select a reasonably cold page, the access
> can be missed when aging proactively for workingset estimation of a of a
> node/memcg.
>
> In more detail, get_pfn_folio returns NULL if the folio's nid != node
> under scanning, so the page table walk skips processing of said pte. Now
> the pmd_young flag on this pmd is cleared, and if none of the pte's are
> accessed before another scan occurrs on the folio's node, the pmd_young
> check fails and the pte accessed bit is skipped.
>
> Since force_scan disables various other optimizations, we check
> force_scan to ignore the non-leaf pmd accessed bit.
>
> Signed-off-by: Yuanchu Xie <yuanchu@xxxxxxxxxx>
> ---
>  mm/vmscan.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index d55e8d07ffc4..73f3718b33f7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3548,7 +3548,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
>
>                 walk->mm_stats[MM_NONLEAF_TOTAL]++;
>
> -               if (should_clear_pmd_young()) {
> +               if (!walk->force_scan && should_clear_pmd_young()) {
>                         if (!pmd_young(val))
>                                 continue;

What about the other should_clear_pmd_young() in walk_pmd_range_locked()?

With that and the typos fixed, we should probably split this patch out,
since it can get reviewed and merged independently.
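
For anyone following along, the missed-access scenario from the commit
message can be reproduced with a toy userspace model. This is only a
sketch and not kernel code: every toy_* name is hypothetical, the walk
is reduced to a single pmd, and it assumes that a force_scan walk
neither tests nor clears the non-leaf pmd bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PTES_PER_PMD 4

/* Hypothetical stand-ins for kernel state; not real kernel types. */
struct toy_pte {
	int nid;	/* node that the mapped folio lives on */
	bool young;	/* pte accessed bit */
};

struct toy_pmd {
	bool young;	/* non-leaf pmd accessed bit */
	struct toy_pte pte[TOY_PTES_PER_PMD];
};

/* Models a CPU access: both the pte and the non-leaf pmd bit get set. */
static void toy_touch(struct toy_pmd *pmd, size_t i)
{
	assert(i < TOY_PTES_PER_PMD);
	pmd->pte[i].young = true;
	pmd->young = true;
}

/*
 * Models one aging walk on behalf of node `nid` and returns how many
 * young ptes it found on that node. Without force_scan, a clear pmd
 * bit skips the whole pmd, and a set pmd bit is cleared by the walk.
 */
static int toy_walk(struct toy_pmd *pmd, int nid, bool force_scan)
{
	int found = 0;

	if (!force_scan) {
		if (!pmd->young)
			return 0;	/* pmd looks cold: no pte is checked */
		pmd->young = false;
	}

	for (size_t i = 0; i < TOY_PTES_PER_PMD; i++) {
		struct toy_pte *pte = &pmd->pte[i];

		if (pte->nid != nid)
			continue;	/* like get_pfn_folio() returning NULL */
		if (pte->young) {
			pte->young = false;
			found++;
		}
	}
	return found;
}

/* Access a page on node 1, walk node 0, then walk node 1. */
static int toy_scenario(bool force_scan)
{
	struct toy_pmd pmd = {
		.young = false,
		.pte = { { .nid = 0 }, { .nid = 1 }, { .nid = 0 }, { .nid = 0 } },
	};

	toy_touch(&pmd, 1);
	toy_walk(&pmd, 0, force_scan);		/* clears pmd.young unless force_scan */
	return toy_walk(&pmd, 1, force_scan);	/* is the node 1 access still seen? */
}
```

In this model toy_scenario(false) returns 0, because the node 0 walk
clears the pmd bit and the node 1 walk then skips the pmd, while
toy_scenario(true) returns 1, because the pte accessed bit is always
checked.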