On Thu, Jan 18, 2018 at 06:26:50PM +0530, Vinayak Menon wrote:
> Based on Kirill's patch [1].
>
> Currently, faultaround code produces young pte. This can screw up
> vmscan behaviour[2], as it makes vmscan think that these pages are hot
> and not push them out on first round.
>
> During sparse file access faultaround gets more pages mapped and all of
> them are young. Under memory pressure, this makes vmscan swap out anon
> pages instead, or to drop other page cache pages which otherwise stay
> resident.
>
> Modify faultaround to produce old ptes if sysctl 'want_old_faultaround_pte'
> is set, so they can easily be reclaimed under memory pressure.
>
> This can to some extend defeat the purpose of faultaround on machines
> without hardware accessed bit as it will not help us with reducing the
> number of minor page faults.
>
> Making the faultaround ptes old results in a unixbench regression for some
> architectures [3][4]. But on some architectures like arm64 it is not found
> to cause any regression.
>
> unixbench shell8 scores on arm64 v8.2 hardware with CONFIG_ARM64_HW_AFDBM
> enabled (5 runs min, max, avg):
> Base: (741,748,744)
> With this patch: (739,748,743)
>
> So by default produce young ptes and provide a sysctl option to make the
> ptes old.
>
> [1] http://lkml.kernel.org/r/1463488366-47723-1-git-send-email-kirill.shutemov@xxxxxxxxxxxxxxx
> [2] https://lkml.kernel.org/r/1460992636-711-1-git-send-email-vinmenon@xxxxxxxxxxxxxx
> [3] https://marc.info/?l=linux-kernel&m=146582237922378&w=2
> [4] https://marc.info/?l=linux-mm&m=146589376909424&w=2
>
> Signed-off-by: Vinayak Menon <vinmenon@xxxxxxxxxxxxxx>
> ---
>
> V2:
> 1. Removed the arch hook and want_old_faultaround_pte is made a sysctl
> 2. Renamed FAULT_FLAG_MKOLD to FAULT_FLAG_PREFAULT_OLD (suggested by Jan Kara)
> 3. Removed the saved fault address from vmf (suggested by Jan Kara)
>
> Documentation/sysctl/vm.txt | 22 ++++++++++++++++++++++
> include/linux/mm.h | 3 +++
> kernel/sysctl.c | 9 +++++++++
> mm/filemap.c | 10 ++++++++++
> mm/memory.c | 4 ++++
> 5 files changed, 48 insertions(+)
>
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 17256f2..e015940 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -63,6 +63,7 @@ Currently, these files are in /proc/sys/vm:
> - vfs_cache_pressure
> - watermark_scale_factor
> - zone_reclaim_mode
> +- want_old_faultaround_pte
>
> ==============================================================
>
> @@ -887,4 +888,25 @@ Allowing regular swap effectively restricts allocations to the local
> node unless explicitly overridden by memory policies or cpuset
> configurations.
>
> +=============================================================
> +
> +want_old_faultaround_pte:
> +
> +By default faultaround code produces young pte. When want_old_faultaround_pte is
> +set to 1, faultaround produces old ptes.
> +
> +During sparse file access faultaround gets more pages mapped and when all of
> +them are young (default), under memory pressure, this makes vmscan swap out anon
> +pages instead, or to drop other page cache pages which otherwise stay resident.
> +Setting want_old_faultaround_pte to 1 avoids this.
> +
> +Making the faultaround ptes old can result in performance regression on some
> +architectures. This is due to cycles spent in micro-fault for TLB lookup of old
> +entry.

It's not for TLB lookup. Micro-fault would take page walk to set young bit
in the pte.

Otherwise patch looks good to me.

Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx. For more info on Linux MM,
see: http://www.linux-mm.org/ .