The patch titled
     Subject: mm: clean up the last pieces of page fault accountings
has been added to the -mm tree.  Its filename is
     mm-clean-up-the-last-pieces-of-page-fault-accountings.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-clean-up-the-last-pieces-of-page-fault-accountings.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Peter Xu <peterx@xxxxxxxxxx>
Subject: mm: clean up the last pieces of page fault accountings

Here are the last pieces of page fault accounting that were still done
outside handle_mm_fault(), in callers that still pass regs==NULL to
handle_mm_fault():

arch/powerpc/mm/copro_fault.c:   copro_handle_mm_fault
arch/sparc/mm/fault_32.c:        force_user_fault
arch/um/kernel/trap.c:           handle_page_fault
mm/gup.c:                        faultin_page
                                 fixup_user_fault
mm/hmm.c:                        hmm_vma_fault
mm/ksm.c:                        break_ksm

Some of them have the issue of duplicated accounting for page fault
retries.  Some of them didn't do the accounting at all.

This patch cleans all these up by letting handle_mm_fault() do per-task
page fault accounting even if regs==NULL (though we'll still skip the perf
event accounting).  With that, we can safely remove all the outliers now.

There's another functional change in that now we account the page faults
to the caller of gup, rather than the task_struct that was passed into the
gup code.  More information on this can be found at [1].

After this patch, the following should never be touched again outside
handle_mm_fault():

  - task_struct.[maj|min]_flt
  - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]

[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@xxxxxxxxxxxxxx/

Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@xxxxxxxxxx
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
Cc: Albert Ou <aou@xxxxxxxxxxxxxxxxx>
Cc: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Cain <bcain@xxxxxxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
Cc: Chris Zankel <chris@xxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: David S. Miller <davem@xxxxxxxxxxxxx>
Cc: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxx>
Cc: Greentime Hu <green.hu@xxxxxxxxx>
Cc: Guo Ren <guoren@xxxxxxxxxx>
Cc: Heiko Carstens <heiko.carstens@xxxxxxxxxx>
Cc: Helge Deller <deller@xxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Ivan Kokshaysky <ink@xxxxxxxxxxxxxxxxxxxx>
Cc: James E.J. Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Cc: Jonas Bonn <jonas@xxxxxxxxxxxx>
Cc: Ley Foon Tan <ley.foon.tan@xxxxxxxxx>
Cc: "Luck, Tony" <tony.luck@xxxxxxxxx>
Cc: Matt Turner <mattst88@xxxxxxxxx>
Cc: Max Filippov <jcmvbkbc@xxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Michal Simek <monstr@xxxxxxxxx>
Cc: Nick Hu <nickhu@xxxxxxxxxxxxx>
Cc: Palmer Dabbelt <palmer@xxxxxxxxxxx>
Cc: Paul Mackerras <paulus@xxxxxxxxx>
Cc: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Richard Henderson <rth@xxxxxxxxxxx>
Cc: Rich Felker <dalias@xxxxxxxx>
Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Cc: Stafford Horne <shorne@xxxxxxxxx>
Cc: Stefan Kristiansson <stefan.kristiansson@xxxxxxxxxxxxx>
Cc: Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
Cc: Vincent Chen <deanbo422@xxxxxxxxx>
Cc: Vineet Gupta <vgupta@xxxxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Yoshinori Sato <ysato@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 arch/powerpc/mm/copro_fault.c |    5 -----
 arch/um/kernel/trap.c         |    4 ----
 mm/gup.c                      |   13 -------------
 mm/memory.c                   |   17 ++++++++++-------
 4 files changed, 10 insertions(+), 29 deletions(-)

--- a/arch/powerpc/mm/copro_fault.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/powerpc/mm/copro_fault.c
@@ -76,11 +76,6 @@ int copro_handle_mm_fault(struct mm_stru
 		BUG();
 	}
 
-	if (*flt & VM_FAULT_MAJOR)
-		current->maj_flt++;
-	else
-		current->min_flt++;
-
 out_unlock:
 	mmap_read_unlock(mm);
 	return ret;
--- a/arch/um/kernel/trap.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/arch/um/kernel/trap.c
@@ -88,10 +88,6 @@ good_area:
 			BUG();
 		}
 		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			if (fault & VM_FAULT_MAJOR)
-				current->maj_flt++;
-			else
-				current->min_flt++;
 			if (fault & VM_FAULT_RETRY) {
 				flags |= FAULT_FLAG_TRIED;
 
--- a/mm/gup.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/gup.c
@@ -893,13 +893,6 @@ static int faultin_page(struct task_stru
 		BUG();
 	}
 
-	if (tsk) {
-		if (ret & VM_FAULT_MAJOR)
-			tsk->maj_flt++;
-		else
-			tsk->min_flt++;
-	}
-
 	if (ret & VM_FAULT_RETRY) {
 		if (locked && !(fault_flags & FAULT_FLAG_RETRY_NOWAIT))
 			*locked = 0;
@@ -1255,12 +1248,6 @@ retry:
 		goto retry;
 	}
 
-	if (tsk) {
-		if (major)
-			tsk->maj_flt++;
-		else
-			tsk->min_flt++;
-	}
 	return 0;
 }
 EXPORT_SYMBOL_GPL(fixup_user_fault);
--- a/mm/memory.c~mm-clean-up-the-last-pieces-of-page-fault-accountings
+++ a/mm/memory.c
@@ -4409,20 +4409,23 @@ static inline void mm_account_fault(stru
 	 */
 	major = (ret & VM_FAULT_MAJOR) || (flags & FAULT_FLAG_TRIED);
 
+	if (major)
+		current->maj_flt++;
+	else
+		current->min_flt++;
+
 	/*
-	 * If the fault is done for GUP, regs will be NULL, and we will skip
-	 * the fault accounting.
+	 * If the fault is done for GUP, regs will be NULL.  We only do the
+	 * accounting for the per thread fault counters who triggered the
+	 * fault, and we skip the perf event updates.
 	 */
 	if (!regs)
 		return;
 
-	if (major) {
-		current->maj_flt++;
+	if (major)
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
-	} else {
-		current->min_flt++;
+	else
 		perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-	}
 }
 
 /*
_

Patches currently in -mm which might be from peterx@xxxxxxxxxx are

mm-do-page-fault-accounting-in-handle_mm_fault.patch
mm-alpha-use-general-page-fault-accounting.patch
mm-arc-use-general-page-fault-accounting.patch
mm-arm-use-general-page-fault-accounting.patch
mm-arm64-use-general-page-fault-accounting.patch
mm-csky-use-general-page-fault-accounting.patch
mm-hexagon-use-general-page-fault-accounting.patch
mm-ia64-use-general-page-fault-accounting.patch
mm-m68k-use-general-page-fault-accounting.patch
mm-microblaze-use-general-page-fault-accounting.patch
mm-mips-use-general-page-fault-accounting.patch
mm-nds32-use-general-page-fault-accounting.patch
mm-nios2-use-general-page-fault-accounting.patch
mm-openrisc-use-general-page-fault-accounting.patch
mm-parisc-use-general-page-fault-accounting.patch
mm-powerpc-use-general-page-fault-accounting.patch
mm-riscv-use-general-page-fault-accounting.patch
mm-s390-use-general-page-fault-accounting.patch
mm-sh-use-general-page-fault-accounting.patch
mm-sparc32-use-general-page-fault-accounting.patch
mm-sparc64-use-general-page-fault-accounting.patch
mm-x86-use-general-page-fault-accounting.patch
mm-xtensa-use-general-page-fault-accounting.patch
mm-clean-up-the-last-pieces-of-page-fault-accountings.patch
mm-gup-remove-task_struct-pointer-for-all-gup-code.patch
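
For readers who want to see the new accounting rule from the mm/memory.c
hunk in isolation, here is a minimal userspace sketch. It is only an
illustration under stated assumptions: every sk_ name, the flag values and
the two structs are hypothetical stand-ins rather than kernel definitions,
and the sketch models just the lines visible in that hunk. The per-task
counters are always bumped; the perf counters are bumped only when regs is
available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flag bits; values are illustrative. */
#define SK_VM_FAULT_MAJOR    0x1u
#define SK_FAULT_FLAG_TRIED  0x2u

/* Stand-in for the two task_struct fields touched by the patch. */
struct sk_task {
	unsigned long maj_flt;
	unsigned long min_flt;
};

/* Stand-in for the two perf software events (MAJ and MIN page faults). */
struct sk_perf {
	unsigned long maj_events;
	unsigned long min_events;
};

/*
 * Sketch of the post-patch accounting order:
 *   1. decide whether the fault counts as major,
 *   2. always update the per-task counter of the current task,
 *   3. update the perf counter only if the caller had regs.
 * have_regs == false models a caller that passes regs == NULL.
 */
static void sk_account_fault(struct sk_task *cur, struct sk_perf *perf,
			     bool have_regs, unsigned int flags,
			     unsigned int ret)
{
	bool major;

	assert(cur != NULL && perf != NULL);

	/* A fault that was already retried is treated as major, as in the hunk. */
	major = (ret & SK_VM_FAULT_MAJOR) || (flags & SK_FAULT_FLAG_TRIED);

	if (major)
		cur->maj_flt++;
	else
		cur->min_flt++;

	/* No regs: skip the perf event update, keep the per-task count. */
	if (!have_regs)
		return;

	if (major)
		perf->maj_events++;
	else
		perf->min_events++;
}
```

Calling sk_account_fault() with have_regs set to false, as a caller
without regs would, bumps min_flt or maj_flt on the current task but leaves
both perf counters untouched.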