On Fri, Mar 11, 2022 at 3:55 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >
> > Some architectures automatically set the accessed bit in PTEs, e.g.,
> > x86 and arm64 v8.2. On architectures that do not have this capability,
> > clearing the accessed bit in a PTE usually triggers a page fault
> > following the TLB miss of this PTE (to emulate the accessed bit).
> >
> > Being aware of this capability can help make better decisions, e.g.,
> > whether to spread the work out over a period of time to reduce bursty
> > page faults when trying to clear the accessed bit in many PTEs.
> >
> > Note that theoretically this capability can be unreliable, e.g.,
> > hotplugged CPUs might be different from builtin ones. Therefore it
> > should not be used in architecture-independent code that involves
> > correctness, e.g., to determine whether TLB flushes are required (in
> > combination with the accessed bit).
> >
> > Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> > Acked-by: Brian Geffon <bgeffon@xxxxxxxxxx>
> > Acked-by: Jan Alexander Steffens (heftig) <heftig@xxxxxxxxxxxxx>
> > Acked-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
> > Acked-by: Steven Barrett <steven@xxxxxxxxxxxx>
> > Acked-by: Suleiman Souhlal <suleiman@xxxxxxxxxx>
> > Acked-by: Will Deacon <will@xxxxxxxxxx>
> > Tested-by: Daniel Byrne <djbyrne@xxxxxxx>
> > Tested-by: Donald Carr <d@xxxxxxxxxxxxxxx>
> > Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Konstantin Kharlamov <Hi-Angel@xxxxxxxxx>
> > Tested-by: Shuang Zhai <szhai2@xxxxxxxxxxxxxxxx>
> > Tested-by: Sofia Trinh <sofia.trinh@edi.works>
> > Tested-by: Vaibhav Jain <vaibhav@xxxxxxxxxxxxx>
> > ---
>
> Reviewed-by: Barry Song <baohua@xxxxxxxxxx>

Thanks.

> i guess arch_has_hw_pte_young() isn't called that often in either
> mm/memory.c or mm/vmscan.c.
> Otherwise, moving to a static key might help. Is it?

MRS shouldn't be slower than either branch of a static key. With a
static key, we can only optimize one of the two cases.

There is a *theoretical* problem with MRS: ARM specs don't prohibit a
physical CPU from supporting both cases (on different logical CPUs).
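
To make the intended use concrete: the capability is meant as a hint for
pacing work, not as an input to anything that affects correctness. Below
is a minimal userspace sketch of that idea, not code from the patch. The
names young_clear_batch(), young_clear_passes() and SOFT_YOUNG_BATCH, as
well as the value 64, are hypothetical, and the boolean parameter stands
in for whatever arch_has_hw_pte_young() would return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap on PTEs cleared per pass when the accessed bit is emulated. */
#define SOFT_YOUNG_BATCH 64

/*
 * Pick how many PTEs to clear in one pass. hw_pte_young stands in for the
 * result of arch_has_hw_pte_young(); here the caller passes it in instead
 * of it being read from the CPU.
 */
static size_t young_clear_batch(bool hw_pte_young, size_t nr_ptes)
{
	/* Hardware sets the accessed bit: clearing it causes no extra faults. */
	if (hw_pte_young)
		return nr_ptes;

	/* Emulated accessed bit: each cleared PTE may fault later, so cap the batch. */
	return nr_ptes < SOFT_YOUNG_BATCH ? nr_ptes : SOFT_YOUNG_BATCH;
}

/* Number of passes needed to cover nr_ptes with the chosen batch size. */
static size_t young_clear_passes(bool hw_pte_young, size_t nr_ptes)
{
	size_t batch = young_clear_batch(hw_pte_young, nr_ptes);

	if (batch == 0)
		return 0;

	/* Round up so a partial final batch still gets its own pass. */
	return (nr_ptes + batch - 1) / batch;
}
```

A wrong answer from the capability check only changes the pacing in this
sketch (one large pass versus several small ones); every PTE is still
visited either way, which is why a hint of this kind can tolerate the
theoretical unreliability mentioned in the commit message.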