Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Tue, 10 Feb 2015 13:00:35 -0800

On Tue, Feb 10, 2015 at 12:42 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
>> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
>> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
>> >> On 02/10/2015 06:41 AM, riel@xxxxxxxxxx wrote:
>> >> >From: Rik van Riel <riel@xxxxxxxxxx>
>> >> >
>> >> >The host kernel is not doing anything while the CPU is executing
>> >> >a KVM guest VCPU, so it can be marked as being in an extended
>> >> >quiescent state, identical to that used when running user space
>> >> >code.
>> >> >
>> >> >The only exception to that rule is when the host handles an
>> >> >interrupt, which is already handled by the irq code, which
>> >> >calls rcu_irq_enter and rcu_irq_exit.
>> >> >
>> >> >The guest_enter and guest_exit functions already switch vtime
>> >> >accounting independent of context tracking. Leave those calls
>> >> >where they are, instead of moving them into the context tracking
>> >> >code.
>> >> >
>> >> >Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
>> >> >---
>> >> >  include/linux/context_tracking.h       | 6 ++++++
>> >> >  include/linux/context_tracking_state.h | 1 +
>> >> >  include/linux/kvm_host.h               | 3 ++-
>> >> >  3 files changed, 9 insertions(+), 1 deletion(-)
>> >> >
>> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> >> >index 954253283709..b65fd1420e53 100644
>> >> >--- a/include/linux/context_tracking.h
>> >> >+++ b/include/linux/context_tracking.h
>> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
>> >> >             vtime_guest_enter(current);
>> >> >     else
>> >> >             current->flags |= PF_VCPU;
>> >> >+
>> >> >+    if (context_tracking_is_enabled())
>> >> >+            context_tracking_enter(IN_GUEST);
>> >>
>> >> Why the if statement?
>> >>
>> >> Also, have you checked how much this hurts guest lightweight
>> >> entry/exit latency?  Context tracking is shockingly expensive for
>> >> reasons I don't fully understand, but hopefully most of it is the
>> >> vtime stuff.  (Context tracking is *so* expensive that I almost
>> >> think we should set the performance taint flag if we enable it,
>> >> assuming that flag ended up getting merged.  Also, we should make
>> >> context tracking faster.)
>> >
>> > It turns out that context_tracking_is_enabled() is a static inline
>> > that uses a static_key, so the overhead should be minimal on platforms
>> > having a full implementation of static keys.
>>
>> Shouldn't we just fold that into context_tracking_xyz_enter?
>
> If I am not getting too confused, Rik did that initially, but it caused
> some pain for the ARM guys.  I don't see a performance downside, at
> least not for a modern compiler that does a decent job of inlining.

It's more of a tidiness issue to me than a performance issue.

>
>> Also, why does the vtime stuff depend on RCU extended quiescent
>> states?  To me, they seem mostly orthogonal other than the fact that
>> they hook into the same places.
>
> I might be missing your point, but...
>
> If there are no scheduling-clock interrupts, then the CPU needs to be
> in an extended quiescent state, otherwise you will get RCU CPU stall
> warnings and eventually OOM.  Similarly, if there are no scheduling-clock
> interupts, then you need to compute the vtime stuff based on start times
> and deltas instead of relying on a scheduling-clock interrupt that never
> comes.  So it isn't that the vtime and RCU stuff are directly related,
> but rather that they both must take evasive action if there are to be
> no scheduling-clock interrupts for an extended time period.

I'm probably missing something, but isn't vtime also used for accurate
CPU time stats?

--Andy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html