On Mon, Apr 22, 2024 at 11:00:39AM +0800, Kefeng Wang wrote:
> Add a userfaultfd_wp() check in vmf_orig_pte_uffd_wp() to avoid the
> unnecessary FAULT_FLAG_ORIG_PTE_VALID check/pte_marker_entry_uffd_wp()
> in most page faults. Note that vmf_orig_pte_uffd_wp() is not inlined
> in either of the two kernel versions; the difference is shown below.
>
> perf data,
>
> perf report -i perf.data.before | grep vmf
>      0.17%     0.13%  lat_pagefault  [kernel.kallsyms]  [k] vmf_orig_pte_uffd_wp.part.0.isra.0
> perf report -i perf.data.after  | grep vmf
>
> lat_pagefault -W 5 -N 5 /tmp/XXX
>   latency              before      after       diff
>   average(8 tests)     0.262675    0.2600375   -0.0026375
>
> Although the difference is small, uffd_wp is a new feature compared
> with previous kernels, so when the vma is not registered with UFFD_WP,
> let's avoid executing the new logic. Also add the __always_inline
> attribute to vmf_orig_pte_uffd_wp(), which makes set_pte_range() only
> check the VM_UFFD_WP flag without the function call. In addition,
> directly call vmf_orig_pte_uffd_wp() in do_anonymous_page() and
> set_pte_range() to save a uffd_wp variable.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>

Reviewed-by: Peter Xu <peterx@xxxxxxxxxx>

-- 
Peter Xu