> -----Original Message-----
> From: Justin He (Arm Technology China)
> Sent: 8 October 2019 9:55
> To: Marc Zyngier <maz@xxxxxxxxxx>; Will Deacon <will@xxxxxxxxxx>
> Cc: Catalin Marinas <Catalin.Marinas@xxxxxxx>; Mark Rutland
> <Mark.Rutland@xxxxxxx>; James Morse <James.Morse@xxxxxxx>;
> Matthew Wilcox <willy@xxxxxxxxxxxxx>; Kirill A. Shutemov
> <kirill.shutemov@xxxxxxxxxxxxxxx>; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Punit Agrawal
> <punitagrawal@xxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>;
> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; hejianet@xxxxxxxxx; Kaly
> Xin (Arm Technology China) <Kaly.Xin@xxxxxxx>; nd <nd@xxxxxxx>
> Subject: RE: [PATCH v10 2/3] arm64: mm: implement
> arch_faults_on_old_pte() on arm64
>
> Hi Will and Marc
>
> > -----Original Message-----
> > From: Marc Zyngier <maz@xxxxxxxxxx>
> > Sent: 1 October 2019 21:32
> > To: Will Deacon <will@xxxxxxxxxx>
> > Cc: Justin He (Arm Technology China) <Justin.He@xxxxxxx>; Catalin
> > Marinas <Catalin.Marinas@xxxxxxx>; Mark Rutland
> > <Mark.Rutland@xxxxxxx>; James Morse <James.Morse@xxxxxxx>;
> > Matthew Wilcox <willy@xxxxxxxxxxxxx>; Kirill A. Shutemov
> > <kirill.shutemov@xxxxxxxxxxxxxxx>; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Punit Agrawal
> > <punitagrawal@xxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>;
> > Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; hejianet@xxxxxxxxx; Kaly
> > Xin (Arm Technology China) <Kaly.Xin@xxxxxxx>
> > Subject: Re: [PATCH v10 2/3] arm64: mm: implement
> > arch_faults_on_old_pte() on arm64
> >
> > On Tue, 1 Oct 2019 13:50:32 +0100
> > Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > > On Mon, Sep 30, 2019 at 09:57:39AM +0800, Jia He wrote:
> > > > On arm64 without hardware Access Flag, copying from user will fail because
> > > > the pte is old and cannot be marked young. So we always end up with zeroed
> > > > page after fork() + CoW for pfn mappings.
> > > > We don't always have a hardware-managed access flag on arm64.
> > > >
> > > > Hence implement arch_faults_on_old_pte on arm64 to indicate that it might
> > > > cause page fault when accessing old pte.
> > > >
> > > > Signed-off-by: Jia He <justin.he@xxxxxxx>
> > > > Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > > > ---
> > > >  arch/arm64/include/asm/pgtable.h | 14 ++++++++++++++
> > > >  1 file changed, 14 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > > > index 7576df00eb50..e96fb82f62de 100644
> > > > --- a/arch/arm64/include/asm/pgtable.h
> > > > +++ b/arch/arm64/include/asm/pgtable.h
> > > > @@ -885,6 +885,20 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
> > > >  #define phys_to_ttbr(addr)	(addr)
> > > >  #endif
> > > >
> > > > +/*
> > > > + * On arm64 without hardware Access Flag, copying from user will fail because
> > > > + * the pte is old and cannot be marked young. So we always end up with zeroed
> > > > + * page after fork() + CoW for pfn mappings. We don't always have a
> > > > + * hardware-managed access flag on arm64.
> > > > + */
> > > > +static inline bool arch_faults_on_old_pte(void)
> > > > +{
> > > > +	WARN_ON(preemptible());
> > > > +
> > > > +	return !cpu_has_hw_af();
> > > > +}
> > >
> > > Does this work correctly in a KVM guest? (i.e. is the MMFR sanitised in that
> > > case, despite not being the case on the host?)
> >
> > Yup, all the 64bit MMFRs are trapped (HCR_EL2.TID3 is set for an
> > AArch64 guest), and we return the sanitised version.
> Thanks for Marc's explanation. I verified the patch series on a KVM guest
> (-M virt) with a simulated nvdimm device created by QEMU. The host is
> ThunderX2 aarch64.
>
> >
> > But that's an interesting remark: we're now trading an extra fault on
> > CPUs that do not support HWAFDBS for a guaranteed trap for each and
> > every guest under the sun that will hit the COW path...
> >
> > My gut feeling is that this is going to be pretty visible. Jia, do you
> > have any numbers for this kind of behaviour?
> It is not the common COW path, but COW for PFN-mapped pages only.
> I added a g_counter before pte_mkyoung in force_mkyoung{} when testing
> vmmalloc_fork at [1].
>
> This test case starts M forked processes and N pthreads. The default is
> M=2, N=4. The g_counter is then about 241, that is, the test hits my patch
> series about 241 times.
> If I set M=20 and N=40 for TEST3, the g_counter is about 1492.

The time overhead of test vmmalloc_fork is:
real    0m5.411s
user    0m4.206s
sys     0m2.699s

>
> [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork
>
>
> --
> Cheers,
> Justin (Jia He)
>
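
For readers following the thread, the decision the patch introduces can be
sketched in user space roughly as below. This is only an illustrative model,
not kernel code: struct model_pte, model_cpu_has_hw_af,
model_arch_faults_on_old_pte() and model_cow_prepare() are hypothetical names
made up for this sketch, the WARN_ON(preemptible()) check is left out, and
model_g_counter merely imitates the g_counter used for the measurement above.
It assumes the caller of arch_faults_on_old_pte() marks an old pte young
before copying from the user address.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a pte: only the Access Flag is modelled. */
struct model_pte {
	bool young;
};

/* Hypothetical stand-in for cpu_has_hw_af(); set by the caller. */
static bool model_cpu_has_hw_af;

/* Counts software "mark young" operations, imitating the test g_counter. */
static unsigned long model_g_counter;

/* Mirrors the patch: an old pte can fault only without hardware AF. */
static bool model_arch_faults_on_old_pte(void)
{
	return !model_cpu_has_hw_af;
}

/*
 * Prepare a pte before copying from the user address it maps.
 * Returns true if the pte had to be marked young in software.
 */
static bool model_cow_prepare(struct model_pte *pte)
{
	if (model_arch_faults_on_old_pte() && !pte->young) {
		pte->young = true;
		model_g_counter++;
		return true;
	}
	return false;
}
```

In this model the counter only increases when the CPU lacks a hardware-managed
Access Flag and the pte is still old, which is the case counted by g_counter
in the numbers above.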