On Mon, May 09, 2016 at 05:50:56PM +0100, Catalin Marinas wrote:
> On Mon, May 09, 2016 at 05:33:10PM +0200, Christoffer Dall wrote:
> > On Wed, Apr 13, 2016 at 05:57:37PM +0100, Catalin Marinas wrote:
> > > The ARMv8.1 architecture extensions introduce support for hardware
> > > updates of the access and dirty information in page table entries. With
> > > VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
> > > PTE_AF bit cleared in the stage 2 page table, instead of raising an
> > > Access Flag fault to EL2 the CPU sets the actual page table entry bit
> > > (10). To ensure that kernel modifications to the page table do not
> > > inadvertently revert a bit set by hardware updates, certain stage 2
> > > software pte/pmd operations must be performed atomically.
> > >
> > > The main user of the AF bit is the kvm_age_hva() mechanism. The
> > > kvm_age_hva_handler() function performs a "test and clear young"
> > > action on the pte/pmd. This needs to be atomic with respect to
> > > automatic hardware updates of the AF bit. Since the AF bit is in the
> > > same position for both stage 1 and stage 2, the patch reuses the
> > > existing ptep_test_and_clear_young() functionality if
> > > __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
> > > existing pte_young/pte_mkold mechanism is preserved.
> > >
> > > kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have
> > > to perform atomic modifications in order to avoid a race with updates
> > > of the AF bit. The arm64 implementation has been re-written using
> > > exclusives.
> > >
> > > Currently, kvm_set_s2pte_writable() (and the pmd equivalent) take a
> > > pointer argument and modify the pte/pmd in place. However, these
> > > functions are only used on local variables rather than actual page
> > > table entries, so it makes more sense to follow the pte_mkwrite()
> > > approach for stage 1 attributes.
> > > The change to kvm_s2pte_mkwrite() makes it clear that these
> > > functions do not modify the actual page table entries.
> >
> > So if one CPU takes a permission fault and is in the process of
> > updating the page table, what prevents another CPU from setting the
> > access flag (on a read, done by HW) on that entry between us reading
> > the old pte and replacing it with the new pte?
>
> There isn't anything that would prevent this. However, as in the stage 1
> scenarios, explicitly writing a PTE as a result of a permission fault
> should also mark the entry as accessed (young, either via the default
> protection in PROT_DEFAULT or explicitly via pte_mkyoung).
>
> > Don't we lose the AF information in that case too?
>
> Since the hardware never clears the AF bit automatically, there is no
> information to lose in this race as long as kvm_set_s2pte_writable()
> sets the AF bit. AFAICT, the PAGE_S2 and PAGE_S2_DEVICE definitions
> already use PROT_DEFAULT, which has the AF bit set.

Duh, yeah, I didn't consider the overall use case we're targeting.
Thanks for reminding me.

> > >  static inline void kvm_set_s2pte_readonly(pte_t *pte)
> > >  {
> > > -	pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;
> > > +	pteval_t pteval;
> > > +	unsigned long tmp;
> > > +
> > > +	asm volatile("//	kvm_set_s2pte_readonly\n"
> > > +	"	prfm	pstl1strm, %2\n"
> > > +	"1:	ldxr	%0, %2\n"
> > > +	"	and	%0, %0, %3	// clear PTE_S2_RDWR\n"
> > > +	"	orr	%0, %0, %4	// set PTE_S2_RDONLY\n"
> > > +	"	stxr	%w1, %0, %2\n"
> > > +	"	cbnz	%w1, 1b\n"
> > > +	: "=&r" (pteval), "=&r" (tmp), "+Q" (pte_val(*pte))
> >
> > So the "+Q" means "pass the memory address of the value, and by the
> > way, the content (not the address itself) can be updated by the
> > assembly code"?
>
> Yes. I think "info gcc" isn't clear enough on this constraint (at least
> to me), but that's something we learnt from the tools guys in the early
> days of the aarch64 kernel port.
> Just finding anything about the meaning of this turned out to be its
> own challenge.

In any case:

Reviewed-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx>

And I have applied it to kvmarm/next.

-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm