[Android-virt] [PATCH] ARM: KVM: Trap and propagate cache maintenance by set/way

On Thu, 31 May 2012 05:17:41 +0100, Marc Zyngier <marc.zyngier at arm.com> wrote:
> The ARM ARM says in bold (B1.14.4):
> "Virtualizing a uniprocessor system within an MP system, permitting a
>  virtual machine to move between different physical processors, makes
>  cache maintenance by set/way difficult. This is because a set/way
>  operation might be interrupted part way through its operation, and
>  therefore the hypervisor must reproduce the effect of the maintenance
>  on both physical processors."
> 
> The direct consequence of this is that we have to trap all set/way
> operations and make sure the other CPUs get the memo. In order to
> avoid performance degradation, we maintain a per vcpu cpumask that
> tracks the physical CPUs on which the cache operation must be performed.
> The remote operation is only executed when migrating the vcpu.
> 
> On the receiving end, we simply clean+invalidate the whole data cache
> to avoid queueing up individual set/way operations.
> 
> Reported-by: Peter Maydell <peter.maydell at linaro.org>
> Cc: Will Deacon <will.deacon at arm.com>
> Cc: Rusty Russell <rusty.russell at linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier at arm.com>
> ---
>  arch/arm/include/asm/kvm_arm.h  |    3 ++-
>  arch/arm/include/asm/kvm_host.h |    3 +++
>  arch/arm/kvm/arm.c              |   15 +++++++++++++++
>  arch/arm/kvm/emulate.c          |   39 +++++++++++++++++++++++++++++++++++++++
>  4 files changed, 59 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
> index a28b5f0..5bdbe61 100644
> --- a/arch/arm/include/asm/kvm_arm.h
> +++ b/arch/arm/include/asm/kvm_arm.h
> @@ -53,6 +53,7 @@
>   * The bits we set in HCR:
>   * TAC:		Trap ACTLR
>   * TSC:		Trap SMC
> + * TSW:		Trap cache operations by set/way
>   * TWI:		Trap WFI
>   * BSU_IS:	Upgrade barriers to the inner shareable domain
>  * FB:		Force broadcast of all maintenance operations
> @@ -61,7 +62,7 @@
>   * FMO:		Override CPSR.F and enable signaling with VF
>   * SWIO:	Turn set/way invalidates into set/way clean+invalidate
>   */
> -#define HCR_GUEST_MASK (HCR_TSC | HCR_TWI | HCR_VM | HCR_BSU_IS | HCR_FB | \
> +#define HCR_GUEST_MASK (HCR_TSC | HCR_TSW | HCR_TWI | HCR_VM | HCR_BSU_IS | HCR_FB | \
>  			HCR_TAC | HCR_AMO | HCR_IMO | HCR_FMO | HCR_SWIO)
>  #define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
>  
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 734a107..69ee513 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -113,6 +113,9 @@ struct kvm_vcpu_arch {
>  	u32 hpfar;		/* Hyp IPA Fault Address Register */
>  	u64 pc_ipa;		/* IPA for the current PC (VA to PA result) */
>  
> +	/* dcache set/way operation pending */
> +	cpumask_t require_dcache_flush;

struct cpumask is my preferred method for modern code (well,
cpumask_var_t would be even better, with an appropriate init, even
though it's going to be a long time before ARM needs offstack
cpumasks!).

> +static bool write_dcsw(struct kvm_vcpu *vcpu,
> +		       const struct coproc_params *p,
> +		       unsigned long cp15_reg)
> +{
> +	cpumask_var_t tmpmask;
> +
> +	if (!alloc_cpumask_var(&tmpmask, GFP_KERNEL))
> +		return false;
> +
> +	switch(p->CRm) {
> +	case 6:			/* Upgrade DCISW to DCCISW, as per HCR.SWIO */
> +	case 14:		/* DCCISW */
> +		asm volatile("mcr p15, 0, %0, c7, c14, 2" : : "r" (p->Rt1));
> +		break;
> +
> +	case 10:		/* DCCSW */
> +		asm volatile("mcr p15, 0, %0, c7, c10, 2" : : "r" (p->Rt1));
> +		break;
> +	}
> +
> +	cpumask_complement(tmpmask, cpumask_of(smp_processor_id()));
> +	cpumask_or(&vcpu->arch.require_dcache_flush,
> +		   &vcpu->arch.require_dcache_flush,
> +		   tmpmask);
> +
> +	free_cpumask_var(tmpmask);

I appreciate the correct use of cpumask_var_t, even though it's going to
be a while until ARM needs offstack cpumasks, but I think this is a bit
more convoluted than it needs to be.

(You do need to disable preemption, though):

    int cpu = get_cpu();

    switch(p->CRm) {
        ...
    }

    /* Everyone else needs to flush. */
    cpumask_setall(&vcpu->arch.require_dcache_flush);
    cpumask_clear_cpu(&vcpu->arch.require_dcache_flush, cpu);
    put_cpu();
                                             
Cheers,
Rusty.

