Re: [PATCH 2/5] kvm/arm64: rework guest entry logic

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 11 Jan 2022 15:35:36 +0000,
Mark Rutland <mark.rutland@xxxxxxx> wrote:
> 
> In kvm_arch_vcpu_ioctl_run() we enter an RCU extended quiescent state
> (EQS) by calling guest_enter_irqoff(), and unmasked IRQs prior to
> exiting the EQS by calling guest_exit(). As the IRQ entry code will not
> wake RCU in this case, we may run the core IRQ code and IRQ handler
> without RCU watching, leading to various potential problems.
> 
> Additionally, we do not inform lockdep or tracing that interrupts will
> be enabled during guest execution, which caan lead to misleading traces
> and warnings that interrupts have been enabled for overly-long periods.
> 
> This patch fixes these issues by using the new timing and context
> entry/exit helpers to ensure that interrupts are handled during guest
> vtime but with RCU watching, with a sequence:
> 
> 	guest_timing_enter_irqoff();
> 
> 	exit_to_guest_mode();
> 	< run the vcpu >
> 	enter_from_guest_mode();
> 
> 	< take any pending IRQs >
> 
> 	guest_timing_exit_irqoff();
> 
> Since instrumentation may make use of RCU, we must also ensure that no
> instrumented code is run during the EQS. I've split out the critical
> section into a new kvm_arm_enter_exit_vcpu() helper which is marked
> noinstr.
> 
> Fixes: 1b3d546daf85ed2b ("arm/arm64: KVM: Properly account for guest CPU time")
> Reported-by: Nicolas Saenz Julienne <nsaenzju@xxxxxxxxxx>
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Alexandru Elisei <alexandru.elisei@xxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: James Morse <james.morse@xxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Cc: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> ---
>  arch/arm64/kvm/arm.c | 51 ++++++++++++++++++++++++++++----------------
>  1 file changed, 33 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e4727dc771bf..1721df2522c8 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -764,6 +764,24 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
>  			xfer_to_guest_mode_work_pending();
>  }
>  
> +/*
> + * Actually run the vCPU, entering an RCU extended quiescent state (EQS) while
> + * the vCPU is running.
> + *
> + * This must be noinstr as instrumentation may make use of RCU, and this is not
> + * safe during the EQS.
> + */
> +static int noinstr kvm_arm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
> +{
> +	int ret;
> +
> +	exit_to_guest_mode();
> +	ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
> +	enter_from_guest_mode();
> +
> +	return ret;
> +}
> +
>  /**
>   * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code
>   * @vcpu:	The VCPU pointer
> @@ -854,9 +872,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  		 * Enter the guest
>  		 */
>  		trace_kvm_entry(*vcpu_pc(vcpu));
> -		guest_enter_irqoff();
> +		guest_timing_enter_irqoff();
>  
> -		ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
> +		ret = kvm_arm_vcpu_enter_exit(vcpu);
>  
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>  		vcpu->stat.exits++;
> @@ -891,26 +909,23 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  		kvm_arch_vcpu_ctxsync_fp(vcpu);
>  
>  		/*
> -		 * We may have taken a host interrupt in HYP mode (ie
> -		 * while executing the guest). This interrupt is still
> -		 * pending, as we haven't serviced it yet!
> +		 * We must ensure that any pending interrupts are taken before
> +		 * we exit guest timing so that timer ticks are accounted as
> +		 * guest time. Transiently unmask interrupts so that any
> +		 * pending interrupts are taken.
>  		 *
> -		 * We're now back in SVC mode, with interrupts
> -		 * disabled.  Enabling the interrupts now will have
> -		 * the effect of taking the interrupt again, in SVC
> -		 * mode this time.
> +		 * Per ARM DDI 0487G.b section D1.13.4, an ISB (or other
> +		 * context synchronization event) is necessary to ensure that
> +		 * pending interrupts are taken.
>  		 */
>  		local_irq_enable();
> +		isb();
> +		local_irq_disable();

Small nit: we may be able to elide this enable/isb/disable dance if a
read of ISR_EL1 returns 0.

> +
> +		guest_timing_exit_irqoff();
> +
> +		local_irq_enable();
>  
> -		/*
> -		 * We do local_irq_enable() before calling guest_exit() so
> -		 * that if a timer interrupt hits while running the guest we
> -		 * account that tick as being spent in the guest.  We enable
> -		 * preemption after calling guest_exit() so that if we get
> -		 * preempted we make sure ticks after that is not counted as
> -		 * guest time.
> -		 */
> -		guest_exit();
>  		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
>  
>  		/* Exit types that need handling before we can be preempted */

Reviewed-by: Marc Zyngier <maz@xxxxxxxxxx>

	M.

-- 
Without deviation from the norm, progress is not possible.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux