Re: [PATCH 1/5] kvm: add exit_to_guest_mode() and enter_from_guest_mode()

Hi Mark,

On Tue, 11 Jan 2022 15:35:35 +0000,
Mark Rutland <mark.rutland@xxxxxxx> wrote:
> 
> When transitioning to/from guest mode, it is necessary to inform
> lockdep, tracing, and RCU in a specific order, similar to the
> requirements for transitions to/from user mode. Additionally, it is
> necessary to perform vtime accounting for a window around running the
> guest, with RCU enabled, such that timer interrupts taken from the guest
> can be accounted as guest time.
> 
> Most architectures don't handle all the necessary pieces, and have a
> number of common bugs, including unsafe usage of RCU during the window
> between guest_enter() and guest_exit().
> 
> On x86, this was dealt with across commits:
> 
>   87fa7f3e98a1310e ("x86/kvm: Move context tracking where it belongs")
>   0642391e2139a2c1 ("x86/kvm/vmx: Add hardirq tracing to guest enter/exit")
>   9fc975e9efd03e57 ("x86/kvm/svm: Add hardirq tracing on guest enter/exit")
>   3ebccdf373c21d86 ("x86/kvm/vmx: Move guest enter/exit into .noinstr.text")
>   135961e0a7d555fc ("x86/kvm/svm: Move guest enter/exit into .noinstr.text")
>   160457140187c5fb ("KVM: x86: Defer vtime accounting 'til after IRQ handling")
>   bc908e091b326467 ("KVM: x86: Consolidate guest enter/exit logic to common helpers")
> 
> ... but those fixes are specific to x86, and as the resulting logic
> (while correct) is split across generic helper functions and
> x86-specific helper functions, it is difficult to see that the
> entry/exit accounting is balanced.
> 
> This patch adds generic helpers which architectures can use to handle
> guest entry/exit consistently and correctly. The guest_{enter,exit}()
> helpers are split into guest_timing_{enter,exit}() to perform vtime
> accounting, and guest_context_{enter,exit}() to perform the necessary
> context tracking and RCU management. The existing guest_{enter,exit}()
> helpers are left as wrappers of these.
> 
> Atop this, new exit_to_guest_mode() and enter_from_guest_mode() helpers
> are added to handle the ordering of lockdep, tracing, and RCU management.
> These are named to align with exit_to_user_mode() and
> enter_from_user_mode().
> 
> Subsequent patches will migrate architectures over to the new helpers,
> following a sequence:
> 
> 	guest_timing_enter_irqoff();
> 
> 	exit_to_guest_mode();
> 	< run the vcpu >
> 	enter_from_guest_mode();
> 
> 	< take any pending IRQs >
> 
> 	guest_timing_exit_irqoff();
> 
> This sequence handles all of the above correctly, and more clearly
> balances the entry and exit portions, making it easier to understand.
> 
> The existing helpers are marked as deprecated, and will be removed once
> all architectures have been converted.
> 
> There should be no functional change as a result of this patch.
> 
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>

Thanks a lot for looking into this and writing this up. I have a
couple of comments below, but they're pretty much cosmetic and are only
there to ensure that I actually understand this stuff. FWIW:

Reviewed-by: Marc Zyngier <maz@xxxxxxxxxx>

> ---
>  include/linux/kvm_host.h | 108 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 105 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index c310648cc8f1..13fcf7979880 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -29,6 +29,8 @@
>  #include <linux/refcount.h>
>  #include <linux/nospec.h>
>  #include <linux/notifier.h>
> +#include <linux/ftrace.h>
> +#include <linux/instrumentation.h>
>  #include <asm/signal.h>
>  
>  #include <linux/kvm.h>
> @@ -362,8 +364,11 @@ struct kvm_vcpu {
>  	int last_used_slot;
>  };
>  
> -/* must be called with irqs disabled */
> -static __always_inline void guest_enter_irqoff(void)
> +/*
> + * Start accounting time towards a guest.
> + * Must be called before entering guest context.
> + */
> +static __always_inline void guest_timing_enter_irqoff(void)
>  {
>  	/*
>  	 * This is running in ioctl context so its safe to assume that it's the
> @@ -372,7 +377,17 @@ static __always_inline void guest_enter_irqoff(void)
>  	instrumentation_begin();
>  	vtime_account_guest_enter();
>  	instrumentation_end();
> +}
>  
> +/*
> + * Enter guest context and enter an RCU extended quiescent state.
> + *
> + * This should be the last thing called before entering the guest, and must be
> + * called after any potential use of RCU (including any potentially
> + * instrumented code).

nit: "the last thing called" is terribly ambiguous. Any architecture
obviously calls a ****load of stuff after this point. Should this be
'the last thing involving RCU' instead?

> + */
> +static __always_inline void guest_context_enter_irqoff(void)
> +{
>  	/*
>  	 * KVM does not hold any references to rcu protected data when it
>  	 * switches CPU into a guest mode. In fact switching to a guest mode
> @@ -388,16 +403,77 @@ static __always_inline void guest_enter_irqoff(void)
>  	}
>  }
>  
> -static __always_inline void guest_exit_irqoff(void)
> +/*
> + * Deprecated. Architectures should move to guest_timing_enter_irqoff() and
> + * exit_to_guest_mode().
> + */
> +static __always_inline void guest_enter_irqoff(void)
> +{
> +	guest_timing_enter_irqoff();
> +	guest_context_enter_irqoff();
> +}
> +
> +/**
> + * exit_to_guest_mode - Fixup state when exiting to guest mode
> + *
> + * This is analogous to exit_to_user_mode(), and ensures we perform the
> + * following in order:
> + *
> + * 1) Trace interrupts on state
> + * 2) Invoke context tracking if enabled to adjust RCU state
> + * 3) Tell lockdep that interrupts are enabled

nit: or rather, are about to be enabled? Certainly on arm64, the
enable happens much later, right at the point where we enter the guest
for real.
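
To make sure we are talking about the same thing, here is a minimal
userspace model of the distinction, not kernel code. Every stub_* and
model_* name and the two flags are made up for illustration. It only
encodes the three steps from the comment above and keeps "what lockdep
has been told" separate from "whether IRQs are really unmasked":

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. Every name below is hypothetical and stands in
 * for the facility named in its comment; none of this is the patch's code.
 */
static bool lockdep_irqs_on;	/* what lockdep has been told */
static bool hw_irqs_on;		/* whether IRQs are really unmasked */
static int step;		/* call counter, used to check the ordering */

/* 1) Trace the "interrupts on" state. */
static void stub_trace_irqs_on(void)
{
	assert(step == 0);
	step = 1;
}

/* 2) Context tracking: adjust RCU state before entering the guest. */
static void stub_context_enter(void)
{
	assert(step == 1);
	step = 2;
}

/* 3) Tell lockdep that interrupts are on, although they are still masked. */
static void stub_lockdep_irqs_on(void)
{
	assert(step == 2);
	lockdep_irqs_on = true;
	step = 3;
}

/* Models exit_to_guest_mode(): the three steps, in the documented order. */
static void model_exit_to_guest_mode(void)
{
	stub_trace_irqs_on();
	stub_context_enter();
	stub_lockdep_irqs_on();
}

/* Models the real guest entry, where IRQs only now become unmasked. */
static void stub_run_vcpu(void)
{
	assert(step == 3);
	hw_irqs_on = true;
	step = 4;
}
```

Calling model_exit_to_guest_mode() and then stub_run_vcpu() leaves a
window in between where lockdep_irqs_on is set but hw_irqs_on is not,
which is why "about to be enabled" reads more accurately to me.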

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


