Re: [PATCH v9 01/14] mm: x86, arm64: add arch_has_hw_pte_young()

On Fri, Mar 11, 2022 at 3:55 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >
> > Some architectures automatically set the accessed bit in PTEs, e.g.,
> > x86 and arm64 v8.2. On architectures that do not have this capability,
> > clearing the accessed bit in a PTE usually triggers a page fault
> > following the TLB miss of this PTE (to emulate the accessed bit).
> >
> > Being aware of this capability can help make better decisions, e.g.,
> > whether to spread the work out over a period of time to reduce bursty
> > page faults when trying to clear the accessed bit in many PTEs.
> >
> > Note that theoretically this capability can be unreliable, e.g.,
> > hotplugged CPUs might be different from builtin ones. Therefore it
> > should not be used in architecture-independent code that involves
> > correctness, e.g., to determine whether TLB flushes are required (in
> > combination with the accessed bit).
> >
> > Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> > Acked-by: Brian Geffon <bgeffon@xxxxxxxxxx>
> > Acked-by: Jan Alexander Steffens (heftig) <heftig@xxxxxxxxxxxxx>
> > Acked-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
> > Acked-by: Steven Barrett <steven@xxxxxxxxxxxx>
> > Acked-by: Suleiman Souhlal <suleiman@xxxxxxxxxx>
> > Acked-by: Will Deacon <will@xxxxxxxxxx>
> > Tested-by: Daniel Byrne <djbyrne@xxxxxxx>
> > Tested-by: Donald Carr <d@xxxxxxxxxxxxxxx>
> > Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Konstantin Kharlamov <Hi-Angel@xxxxxxxxx>
> > Tested-by: Shuang Zhai <szhai2@xxxxxxxxxxxxxxxx>
> > Tested-by: Sofia Trinh <sofia.trinh@edi.works>
> > Tested-by: Vaibhav Jain <vaibhav@xxxxxxxxxxxxx>
> > ---
>
> Reviewed-by: Barry Song <baohua@xxxxxxxxxx>

Thanks.

> I guess arch_has_hw_pte_young() isn't called that often in either
> mm/memory.c or mm/vmscan.c.
> Otherwise, moving to a static key might help. Would it?

MRS shouldn't be slower than either branch of a static key. With a
static key, we can only optimize one of the two cases.

There is a *theoretical* problem with MRS: the ARM specs don't prohibit
a physical CPU from supporting both cases (on different logical CPUs).
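
To make the theoretical problem concrete, here is a minimal user-space
model, not kernel code. It assumes the capability is reported in the
HAFDBS field (bits [3:0] of ID_AA64MMFR1_EL1), where a nonzero value
means the hardware sets the accessed bit. The table model_mmfr1[] and
both function names are hypothetical; the table stands in for what MRS
would return on each logical CPU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HAFDBS: bits [3:0] of ID_AA64MMFR1_EL1; nonzero = hardware sets the accessed bit. */
#define MODEL_HAFDBS_MASK 0xfULL

/*
 * Hypothetical register snapshots, one per logical CPU. CPUs 0-1 report
 * the capability, CPUs 2-3 do not. This mixed layout is an assumption
 * made only to illustrate the case the specs do not rule out.
 */
static const uint64_t model_mmfr1[] = { 0x2, 0x2, 0x0, 0x0 };

/* Per-call query: the answer depends on which CPU runs the check. */
static bool model_hw_pte_young(size_t cpu)
{
	return (model_mmfr1[cpu] & MODEL_HAFDBS_MASK) != 0;
}

/* One-time system-wide answer: true only if every CPU has the capability. */
static bool model_hw_pte_young_all(void)
{
	size_t n = sizeof(model_mmfr1) / sizeof(model_mmfr1[0]);

	for (size_t i = 0; i < n; i++) {
		if (!model_hw_pte_young(i))
			return false;
	}
	return true;
}
```

In this model, model_hw_pte_young(0) and model_hw_pte_young(2) disagree,
so a task that migrates between those CPUs would see the answer change.
That is harmless for a heuristic but is the reason the result should not
be used where correctness depends on it.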



