Hi Will

> -----Original Message-----
> From: Will Deacon <will@xxxxxxxxxx>
> Sent: October 1, 2019 20:54
> To: Justin He (Arm Technology China) <Justin.He@xxxxxxx>
> Cc: Catalin Marinas <Catalin.Marinas@xxxxxxx>; Mark Rutland
> <Mark.Rutland@xxxxxxx>; James Morse <James.Morse@xxxxxxx>; Marc
> Zyngier <maz@xxxxxxxxxx>; Matthew Wilcox <willy@xxxxxxxxxxxxx>; Kirill A.
> Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-mm@xxxxxxxxx; Punit Agrawal <punitagrawal@xxxxxxxxx>; Thomas
> Gleixner <tglx@xxxxxxxxxxxxx>; Andrew Morton <akpm@linux-foundation.org>;
> hejianet@xxxxxxxxx; Kaly Xin (Arm Technology China) <Kaly.Xin@xxxxxxx>
> Subject: Re: [PATCH v10 3/3] mm: fix double page fault on arm64 if PTE_AF
> is cleared
>
> On Mon, Sep 30, 2019 at 09:57:40AM +0800, Jia He wrote:
> > When we tested pmdk unit test [1] vmmalloc_fork TEST1 in arm64 guest, there
> > will be a double page fault in __copy_from_user_inatomic of cow_user_page.
> >
> > Below call trace is from arm64 do_page_fault for debugging purpose
> > [ 110.016195] Call trace:
> > [ 110.016826] do_page_fault+0x5a4/0x690
> > [ 110.017812] do_mem_abort+0x50/0xb0
> > [ 110.018726] el1_da+0x20/0xc4
> > [ 110.019492] __arch_copy_from_user+0x180/0x280
> > [ 110.020646] do_wp_page+0xb0/0x860
> > [ 110.021517] __handle_mm_fault+0x994/0x1338
> > [ 110.022606] handle_mm_fault+0xe8/0x180
> > [ 110.023584] do_page_fault+0x240/0x690
> > [ 110.024535] do_mem_abort+0x50/0xb0
> > [ 110.025423] el0_da+0x20/0x24
> >
> > The pte info before __copy_from_user_inatomic is (PTE_AF is cleared):
> > [ffff9b007000] pgd=000000023d4f8003, pud=000000023da9b003, pmd=000000023d4b3003, pte=360000298607bd3
> >
> > As told by Catalin: "On arm64 without hardware Access Flag, copying from
> > user will fail because the pte is old and cannot be marked young. So we
> > always end up with zeroed page after fork() + CoW for pfn mappings.
> > we don't always have a hardware-managed access flag on arm64."
> >
> > This patch fix it by calling pte_mkyoung. Also, the parameter is
> > changed because vmf should be passed to cow_user_page()
> >
> > Add a WARN_ON_ONCE when __copy_from_user_inatomic() returns error
> > in case there can be some obscure use-case.(by Kirill)
> >
> > [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork
> >
> > Signed-off-by: Jia He <justin.he@xxxxxxx>
> > Reported-by: Yibo Cai <Yibo.Cai@xxxxxxx>
> > Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > ---
> >  mm/memory.c | 99 +++++++++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 84 insertions(+), 15 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index b1ca51a079f2..1f56b0118ef5 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -118,6 +118,13 @@ int randomize_va_space __read_mostly =
> >  					2;
> >  #endif
> >
> > +#ifndef arch_faults_on_old_pte
> > +static inline bool arch_faults_on_old_pte(void)
> > +{
> > +	return false;
> > +}
> > +#endif
>
> Kirill has acked this, so I'm happy to take the patch as-is, however isn't
> it the case that /most/ architectures will want to return true for
> arch_faults_on_old_pte()? In which case, wouldn't it make more sense for
> that to be the default, and have x86 and arm64 provide an override? For
> example, aren't most architectures still going to hit the double fault
> scenario even with your patch applied?

No. With my patch series applied, only architectures that do not set the
access flag in hardware AND do not implement their own
arch_faults_on_old_pte() can still hit the double page fault.

Returning true from arch_faults_on_old_pte() means "this arch has no
hardware mechanism for setting the access flag, so an access through an
old pte might cause a page fault".

I don't want to change other architectures' default behavior here.
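
To make the override semantics concrete, here is a small userspace
sketch. It is a toy model only, not the mm/memory.c code: the toy_*
helpers, TOY_PTE_AF and the "simulated arch" override are names I made up
for illustration, and the copy-preparation step is just my reading of
"fix it by calling pte_mkyoung" from the commit message. The only parts
taken from the patch are the #ifndef fallback and the pte value from the
log above (bit 10 is the access flag on arm64).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical arch header for a CPU WITHOUT a hardware-managed access
 * flag. Defining the macro with the same name as the function is what
 * lets the generic "#ifndef" fallback below be skipped.
 */
static inline bool arch_faults_on_old_pte(void)
{
	return true;	/* an access through an old pte may fault here */
}
#define arch_faults_on_old_pte arch_faults_on_old_pte

/* Generic fallback, same shape as the quoted mm/memory.c hunk. */
#ifndef arch_faults_on_old_pte
static inline bool arch_faults_on_old_pte(void)
{
	return false;
}
#endif

/* Toy pte helpers (assumption: access flag modelled as bit 10). */
#define TOY_PTE_AF	(UINT64_C(1) << 10)

static inline bool toy_pte_young(uint64_t pte)
{
	return (pte & TOY_PTE_AF) != 0;
}

static inline uint64_t toy_pte_mkyoung(uint64_t pte)
{
	return pte | TOY_PTE_AF;
}

/*
 * Toy version of the step done before copying from the source page:
 * only when the arch can fault on an old pte, mark the pte young so
 * that the copy does not take a second fault.
 */
static inline uint64_t toy_prepare_src_pte(uint64_t pte)
{
	if (arch_faults_on_old_pte() && !toy_pte_young(pte))
		pte = toy_pte_mkyoung(pte);
	return pte;
}
```

With the override present, toy_prepare_src_pte(0x360000298607bd3) sets
bit 10; if the override block is deleted, the fallback returns false and
the pte is left untouched, which is the default behavior kept for all
other architectures.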
So by default, arch_faults_on_old_pte() is false.

Btw, so far I have only observed this double page fault in an arm64 guest
(host is ThunderX2). In an x86 guest (host is Intel(R) Core(TM) i7-4790
CPU @ 3.60GHz) there is no such double page fault; x86 has a similar
hardware mechanism for setting the access flag.

--
Cheers,
Justin (Jia He)