> -----Original Message----- > From: Wood Scott-B07421 > Sent: Tuesday, August 06, 2013 12:49 AM > To: Bhushan Bharat-R65777 > Cc: Benjamin Herrenschmidt; Wood Scott-B07421; agraf@xxxxxxx; kvm- > ppc@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linuxppc-dev@xxxxxxxxxxxxxxxx > Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like > booke3s > > On Mon, 2013-08-05 at 09:27 -0500, Bhushan Bharat-R65777 wrote: > > > > > -----Original Message----- > > > From: Benjamin Herrenschmidt [mailto:benh@xxxxxxxxxxxxxxxxxxx] > > > Sent: Saturday, August 03, 2013 9:54 AM > > > To: Bhushan Bharat-R65777 > > > Cc: Wood Scott-B07421; agraf@xxxxxxx; kvm-ppc@xxxxxxxxxxxxxxx; > > > kvm@xxxxxxxxxxxxxxx; linuxppc-dev@xxxxxxxxxxxxxxxx > > > Subject: Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte > > > lookup like booke3s > > > > > > On Sat, 2013-08-03 at 02:58 +0000, Bhushan Bharat-R65777 wrote: > > > > One of the problem I saw was that if I put this code in > > > > asm/pgtable-32.h and asm/pgtable-64.h then pte_persent() and other > > > > friend function (on which this code depends) are defined in pgtable.h. > > > > And pgtable.h includes asm/pgtable-32.h and asm/pgtable-64.h > > > > before it defines pte_present() and friends functions. > > > > > > > > Ok I move wove this in asm/pgtable*.h, initially I fought with > > > > myself to take this code in pgtable* but finally end up doing here > > > > (got biased by book3s :)). > > > > > > Is there a reason why these routines can not be completely generic > > > in pgtable.h ? > > > > How about the generic function: > > > > diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h > > b/arch/powerpc/include/asm/pgtable-ppc64.h > > index d257d98..21daf28 100644 > > --- a/arch/powerpc/include/asm/pgtable-ppc64.h > > +++ b/arch/powerpc/include/asm/pgtable-ppc64.h > > @@ -221,6 +221,27 @@ static inline unsigned long pte_update(struct mm_struct > *mm, > > return old; > > } > > > > +static inline unsigned long pte_read(pte_t *p) { #ifdef > > +PTE_ATOMIC_UPDATES > > + pte_t pte; > > + pte_t tmp; > > + __asm__ __volatile__ ( > > + "1: ldarx %0,0,%3\n" > > + " andi. %1,%0,%4\n" > > + " bne- 1b\n" > > + " ori %1,%0,%4\n" > > + " stdcx. %1,0,%3\n" > > + " bne- 1b" > > + : "=&r" (pte), "=&r" (tmp), "=m" (*p) > > + : "r" (p), "i" (_PAGE_BUSY) > > + : "cc"); > > + > > + return pte; > > +#else > > + return pte_val(*p); > > +#endif > > +#endif > > +} > > static inline int __ptep_test_and_clear_young(struct mm_struct *mm, > > unsigned long addr, > > pte_t *ptep) > > Please leave a blank line between functions. > > > { > > diff --git a/arch/powerpc/include/asm/pgtable.h > > b/arch/powerpc/include/asm/pgtable.h > > index 690c8c2..dad712c 100644 > > --- a/arch/powerpc/include/asm/pgtable.h > > +++ b/arch/powerpc/include/asm/pgtable.h > > @@ -254,6 +254,45 @@ static inline pte_t > > *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea, } #endif > > /* !CONFIG_HUGETLB_PAGE */ > > > > +static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva, > > + int writing, unsigned long > > +*pte_sizep) > > The name implies that it just reads the PTE. Setting accessed/dirty shouldn't > be an undocumented side-effect. Ok, will rename and document. > Why can't the caller do that (or a different > function that the caller calls afterward if desired)? The current implementation in book3s is; 1) find a pte/hugepte 2) return null if pte not present 3) take _PAGE_BUSY lock 4) set accessed/dirty 5) clear _PAGE_BUSY. What I tried was 1) find a pte/hugepte 2) return null if pte not present 3) return pte (not take lock by not setting _PAGE_BUSY) 4) then user calls __ptep_set_access_flags() to atomic update the dirty/accessed flags in pte. - but the benchmark results were not good - Also can there be race as we do not take lock in step 3 and update in step 4 ? > > Though even then you have the undocumented side effect of locking the PTE on > certain targets. > > > +{ > > + pte_t *ptep; > > + pte_t pte; > > + unsigned long ps = *pte_sizep; > > + unsigned int shift; > > + > > + ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift); > > + if (!ptep) > > + return __pte(0); > > + if (shift) > > + *pte_sizep = 1ul << shift; > > + else > > + *pte_sizep = PAGE_SIZE; > > + > > + if (ps > *pte_sizep) > > + return __pte(0); > > + > > + if (!pte_present(*ptep)) > > + return __pte(0); > > + > > +#ifdef CONFIG_PPC64 > > + /* Lock PTE (set _PAGE_BUSY) and read */ > > + pte = pte_read(ptep); > > +#else > > + pte = pte_val(*ptep); > > +#endif > > What about 32-bit platforms that need atomic PTEs? I called __ptep_set_access_flags() for both 32/64bit (for 64bit I was not calling pte_read()), which handles atomic updates. Somehow the benchmark result were not good, will try again. Thanks -Bharat > > -Scott > ��.n��������+%������w��{.n�����o�^n�r������&��z�ޗ�zf���h���~����������_��+v���)ߣ�