Christoffer Dall <christoffer.dall@xxxxxxx> writes: > [Adding Andrea and Steve in CC] > > On Thu, Aug 23, 2018 at 04:33:42PM +0100, Marc Zyngier wrote: >> When triggering a CoW, we unmap the RO page via an MMU notifier >> (invalidate_range_start), and then populate the new PTE using another >> one (change_pte). In the meantime, we'll have copied the old page >> into the new one. >> >> The problem is that the data for the new page is sitting in the >> cache, and should the guest have an uncached mapping to that page >> (or its MMU off), following accesses will bypass the cache. >> >> In a way, this is similar to what happens on a translation fault: >> We need to clean the page to the PoC before mapping it. So let's just >> do that. >> >> This fixes a KVM unit test regression observed on a HiSilicon platform, >> and subsequently reproduced on Seattle. >> >> Fixes: a9c0e12ebee5 ("KVM: arm/arm64: Only clean the dcache on translation fault") >> Reported-by: Mike Galbraith <efault@xxxxxx> >> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx> >> --- >> virt/kvm/arm/mmu.c | 9 ++++++++- >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c >> index 1d90d79706bd..287c8e274655 100644 >> --- a/virt/kvm/arm/mmu.c >> +++ b/virt/kvm/arm/mmu.c >> @@ -1811,13 +1811,20 @@ static int kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data >> void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) >> { >> unsigned long end = hva + PAGE_SIZE; >> + kvm_pfn_t pfn = pte_pfn(pte); >> pte_t stage2_pte; >> >> if (!kvm->arch.pgd) >> return; >> >> trace_kvm_set_spte_hva(hva); >> - stage2_pte = pfn_pte(pte_pfn(pte), PAGE_S2); >> + >> + /* >> + * We've moved a page around, probably through CoW, so let's treat >> + * just like a translation fault and clean the cache to the PoC. >> + */ >> + clean_dcache_guest_page(pfn, PAGE_SIZE); >> + stage2_pte = pfn_pte(pfn, PAGE_S2); >> handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte); >> } > > How does this work for pmd mappings? kvm_set_spte_hva() isn't called for PMD mappings. But... > > Are we guaranteed that a pmd mapping (hugetlbfs or THP) is split before > a CoW happens? > > Steve tells me that we share THP mappings on fork and that we back THPs > by a zero page, so CoW with THP should be possible. > ...the problem seems to affect handling write permission faults for CoW or zero pages. The memory gets unmapped at stage 2 due to the invalidate notifier (in hugetlb_cow() for hugetlbfs and do_huge_pmd_wp_page() for THP) while the cache maintenance for the newly allocated page will be skipped due to the !FSC_PERM. Hmm... smells like there might be a problem here. I'll try and put together a fix. > Thanks, > Christoffer > _______________________________________________ > kvmarm mailing list > kvmarm@xxxxxxxxxxxxxxxxxxxxx > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm