Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/17/14 16:29, Paolo Bonzini wrote:
> 
> 
> On 17/11/2014 15:58, Ard Biesheuvel wrote:
>> Readonly memslots are often used to implement emulation of ROMs and
>> NOR flashes, in which case the guest may legally map these regions as
>> uncached.
>> To deal with the incoherency associated with uncached guest mappings,
>> treat all readonly memslots as incoherent, and ensure that pages that
>> belong to regions tagged as such are flushed to DRAM before being passed
>> to the guest.
> 
> On x86, the processor combines the cacheability values from the two
> levels of page tables.  Is there no way to do the same on ARM?

Combining occurs on ARMv8 too. The Stage1 (guest) mapping is very strict
(Device non-Gathering, non-Reordering, no Early Write Acknowledgement --
for EFI_MEMORY_UC), which basically "overrides" the Stage2 (very lax
host) memory attributes.

When qemu writes, as part of emulating the flash programming commands,
to the RAMBlock that *otherwise* backs the flash range (as a r/o
memslot), those writes (from host userspace) tend to end up in dcache.

But, when the guest flips back the flash to romd mode, and tries to read
back the values from the flash as plain ROM, the dcache is completely
bypassed due to the strict stage1 mapping, and the guest goes directly
to DRAM.

Where qemu's earlier writes are not yet / necessarily visible.

Please see my original patch (which was incomplete) in the attachment,
it has a very verbose commit message.

Anyway, I'll let others explain; they can word it better than I can :)

FWIW,

Series
Reviewed-by: Laszlo Ersek <lersek@xxxxxxxxxx>

I ported this series to a 3.17.0+ based kernel, and tested it. It works
fine. The ROM-like view of the NOR flash now reflects the previously
programmed contents.

Series
Tested-by: Laszlo Ersek <lersek@xxxxxxxxxx>

Thanks!
Laszlo
>From a2b4da9b03f03ccdb8b0988a5cc64d1967f00398 Mon Sep 17 00:00:00 2001
From: Laszlo Ersek <lersek@xxxxxxxxxx>
Date: Sun, 16 Nov 2014 01:43:11 +0100
Subject: [PATCH] arm, arm64: KVM: clean cache on page fault also when IPA is
 uncached (WIP)

This patch builds on Marc Zyngier's commit
2d58b733c87689d3d5144e4ac94ea861cc729145.

(1) The guest bypasses the cache *not only* when the VCPU's dcache is
    disabled (see bit 0 and bit 2 in SCTLR_EL1, "MMU enable" and "Cache
    enable", respectively -- vcpu_has_cache_enabled()).

    The guest bypasses the cache *also* when the Stage 1 memory attributes
    say "device memory" about the Intermediate Page Address in question,
    independently of the Stage 2 memory attributes. Refer to:

      Table D5-38
      Combining the stage 1 and stage 2 memory type assignments

    in the ARM ARM. (This is likely similar to MTRRs on x86.)

(2) In edk2 (EFI Development Kit II), the ARM NOR flash driver,

      ArmPlatformPkg/Drivers/NorFlashDxe/NorFlashFvbDxe.c

    uses the AddMemorySpace() and SetMemorySpaceAttributes() Global
    Coherency Domain Services of DXE (Driver eXecution Environment) to
    *justifiedly* set the attributes of the guest memory covering the
    flash chip to EFI_MEMORY_UC ("uncached").

    According to the AArch64 bindings for UEFI (see "2.3.6.1 Memory types"
    in the UEFI-2.4A specification), EFI_MEMORY_UC is mapped to:

                       ARM Memory Type:
                       MAIR attribute encoding    ARM Memory Type:
    EFI Memory Type    Attr<n> [7:4] [3:0]        Meaning
    ---------------    -----------------------    ------------------------
    EFI_MEMORY_UC      0000 0000                  Device-nGnRnE (Device
    (Not cacheable)                               non-Gathering,
                                                  non-Reordering, no Early
                                                  Write Acknowledgement)

    This is correctly implemented in edk2, in the ArmConfigureMmu()
    function, via the ArmSetMAIR() call and the MAIR_ATTR() macro:

    The TT_ATTR_INDX_DEVICE_MEMORY (== 0) memory attribute index, which is
    used for EFI_MEMORY_UC memory, is associated with the
    MAIR_ATTR_DEVICE_MEMORY (== 0x00, see above) memory attribute value,
    in the MAIR_ELx register.

As a consequence of (1) and (2), when edk2 code running in the guest
accesses an IPA falling in the flash range, it will completely bypass
the cache.

Therefore, when such a page is faulted in in user_mem_abort(), we must
flush the data cache; otherwise the guest will see stale data in the flash
chip.

This patch is not complete because I have no clue how to calculate the
memory attribute for "fault_ipa" in user_mem_abort(). Right now I set
"fault_ipa_uncached" to constant true, which might incur some performance
penalty for data faults, but it certainly improves correctness -- the
ArmVirtualizationPkg platform build of edk2 actually boots as a KVM guest
on APM Mustang.

Signed-off-by: Laszlo Ersek <lersek@xxxxxxxxxx>
---
 arch/arm/include/asm/kvm_mmu.h   |  5 +++--
 arch/arm64/include/asm/kvm_mmu.h |  5 +++--
 arch/arm/kvm/mmu.c               | 10 ++++++++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index acb0d57..f867060 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -161,9 +161,10 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 }
 
 static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
-					     unsigned long size)
+					     unsigned long size,
+					     bool ipa_uncached)
 {
-	if (!vcpu_has_cache_enabled(vcpu))
+	if (!vcpu_has_cache_enabled(vcpu) || ipa_uncached)
 		kvm_flush_dcache_to_poc((void *)hva, size);
 	
 	/*
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 0caf7a5..123b521 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -243,9 +243,10 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 }
 
 static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
-					     unsigned long size)
+					     unsigned long size,
+					     bool ipa_uncached)
 {
-	if (!vcpu_has_cache_enabled(vcpu))
+	if (!vcpu_has_cache_enabled(vcpu) || ipa_uncached)
 		kvm_flush_dcache_to_poc((void *)hva, size);
 
 	if (!icache_is_aliasing()) {		/* PIPT */
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 57a403a..e2323b7 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -847,6 +847,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	struct vm_area_struct *vma;
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
+	bool fault_ipa_uncached;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -913,6 +914,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (!hugetlb && !force_pte)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
+	/* TBD */
+	fault_ipa_uncached = true;
+
 	if (hugetlb) {
 		pmd_t new_pmd = pfn_pmd(pfn, mem_type);
 		new_pmd = pmd_mkhuge(new_pmd);
@@ -920,7 +924,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			kvm_set_s2pmd_writable(&new_pmd);
 			kvm_set_pfn_dirty(pfn);
 		}
-		coherent_cache_guest_page(vcpu, hva & PMD_MASK, PMD_SIZE);
+		coherent_cache_guest_page(vcpu, hva & PMD_MASK, PMD_SIZE,
+					  fault_ipa_uncached);
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
@@ -928,7 +933,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
-		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE);
+		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
+					  fault_ipa_uncached);
 		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
 			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
 	}
-- 
1.8.3.1


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux