Re: [PATCH 3/4] arm/arm64: KVM: Flush caches to memory on unmap

Christoffer Dall <christoffer.dall@xxxxxxxxxx> · Sun, 11 Jan 2015 13:30:01 +0100

On Fri, Jan 09, 2015 at 02:35:49PM +0000, Marc Zyngier wrote:
> On 09/01/15 12:30, Christoffer Dall wrote:
> > On Thu, Jan 08, 2015 at 11:59:08AM +0000, Marc Zyngier wrote:
> >> Let's assume a guest has created an uncached mapping, and written
> >> to that page. Let's also assume that the host uses a cache-coherent
> >> IO subsystem. Let's finally assume that the host is under memory
> >> pressure and starts to swap things out.
> >>
> >> Before this "uncached" page is evicted, we need to make sure it
> >> gets invalidated, or the IO subsystem is going to swap out the
> >> cached view, loosing the data that has been written there.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx>
> >> ---
> >>  arch/arm/include/asm/kvm_mmu.h   | 31 +++++++++++++++++++++++++++
> >>  arch/arm/kvm/mmu.c               | 46 +++++++++++++++++++++++++++-------------
> >>  arch/arm64/include/asm/kvm_mmu.h | 18 ++++++++++++++++
> >>  3 files changed, 80 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> >> index 63e0ecc..7ceb836 100644
> >> --- a/arch/arm/include/asm/kvm_mmu.h
> >> +++ b/arch/arm/include/asm/kvm_mmu.h
> >> @@ -44,6 +44,7 @@
> >>  
> >>  #ifndef __ASSEMBLY__
> >>  
> >> +#include <linux/highmem.h>
> >>  #include <asm/cacheflush.h>
> >>  #include <asm/pgalloc.h>
> >>  
> >> @@ -190,6 +191,36 @@ static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva,
> >>  
> >>  #define kvm_virt_to_phys(x)		virt_to_idmap((unsigned long)(x))
> >>  
> >> +static inline void __kvm_flush_dcache_pte(pte_t pte)
> >> +{
> >> +	void *va = kmap_atomic(pte_page(pte));
> >> +
> >> +	kvm_flush_dcache_to_poc(va, PAGE_SIZE);
> >> +
> >> +	kunmap_atomic(va);
> >> +}
> >> +
> >> +static inline void __kvm_flush_dcache_pmd(pmd_t pmd)
> >> +{
> >> +	unsigned long size = PMD_SIZE;
> >> +	pfn_t pfn = pmd_pfn(pmd);
> >> +
> >> +	while (size) {
> >> +		void *va = kmap_atomic_pfn(pfn);
> >> +
> >> +		kvm_flush_dcache_to_poc(va, PAGE_SIZE);
> >> +
> >> +		pfn++;
> >> +		size -= PAGE_SIZE;
> >> +
> >> +		kunmap_atomic(va);
> >> +	}
> >> +}
> >> +
> >> +static inline void __kvm_flush_dcache_pud(pud_t pud)
> >> +{
> >> +}
> >> +
> >>  void stage2_flush_vm(struct kvm *kvm);
> >>  
> >>  #endif	/* !__ASSEMBLY__ */
> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >> index 1dc9778..1f5b793 100644
> >> --- a/arch/arm/kvm/mmu.c
> >> +++ b/arch/arm/kvm/mmu.c
> >> @@ -58,6 +58,21 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> >>  		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
> >>  }
> >>  
> >> +static void kvm_flush_dcache_pte(pte_t pte)
> >> +{
> >> +	__kvm_flush_dcache_pte(pte);
> >> +}
> >> +
> >> +static void kvm_flush_dcache_pmd(pmd_t pmd)
> >> +{
> >> +	__kvm_flush_dcache_pmd(pmd);
> >> +}
> >> +
> >> +static void kvm_flush_dcache_pud(pud_t pud)
> >> +{
> >> +	__kvm_flush_dcache_pud(pud);
> >> +}
> >> +
> >>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
> >>  				  int min, int max)
> >>  {
> >> @@ -128,9 +143,12 @@ static void unmap_ptes(struct kvm *kvm, pmd_t *pmd,
> >>  	start_pte = pte = pte_offset_kernel(pmd, addr);
> >>  	do {
> >>  		if (!pte_none(*pte)) {
> >> +			pte_t old_pte = *pte;
> >>  			kvm_set_pte(pte, __pte(0));
> >> -			put_page(virt_to_page(pte));
> > 
> > was this a bug beforehand that we released the page before we flushed
> > the TLB?
> 
> I don't think so. TLB maintenance doesn't require the mapping to exist
> in the page tables (while the cache maintenance do).
> 

duh, the put_page is the ref counting on the page table itself, not the
underlying memory page.  Forget what I asked.

> >>  			kvm_tlb_flush_vmid_ipa(kvm, addr);
> >> +			if ((pte_val(old_pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
> >> +				kvm_flush_dcache_pte(old_pte);
> > 
> > this is confusing me: We are only flushing the cache for cached stage-2
> > mappings?  Weren't we trying to flush the cache for uncached mappings?
> > (we obviously also need flush a cached stage-2 mapping but where the
> > guest is mapping it as uncached, but we don't know that easily).
> > 
> > Am I missing something completely here?
> 
> I think you must be misreading something:
> - we want to invalidate mappings because the guest may have created an
> uncached mapping

I don't quite understand: we are invalidating mappings because the page
is being swapped out (and the guest must fault if it tries to access it
again).  Not because the guest may have created an uncached mapping,
that's just an aspect of the situation.  Or am I thinking about this the
wrong way?

> - as we cannot know about the guest's uncached mappings, we flush things
> unconditionally (basically anything that is RAM).

so isn't the problem that the host may have an invalid cached view of
the data, so we need to invalidate that view, not flush the invalid data
to RAM?  Does the kernel take care of that somewhere else for a
cache-coherent IO subsystem?

> - we avoid flushing devices because it is pointless (and the kernel
> doesn't have a linear mapping for those).

Ah, that explains the if statement.  I think a one-line comment about
this would be helpful (I didn't get it, obviously).

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html