Roman Kagan <rkagan@xxxxxxxxxxxxx> writes:

> On Tue, Dec 11, 2018 at 01:45:05PM +0100, Vitaly Kuznetsov wrote:
>> Roman Kagan <rkagan@xxxxxxxxxxxxx> writes:
>>
>> > The SynIC message delivery protocol allows the message originator to
>> > request, should the message slot be busy, to be notified when it's
>> > free.
>> >
>> > However, this is unnecessary and even undesirable for messages
>> > generated by SynIC timers in periodic mode: if the period is short
>> > enough compared to the time the guest spends in the timer interrupt
>> > handler, so that timer ticks start piling up, the excessive
>> > interactions due to this notification and retried message delivery
>> > only make things worse.
>> >
>> > [This was observed, in particular, with Windows L2 guests setting
>> > (temporarily) the periodic timer to 2 kHz, i.e. a 500 us period,
>> > and spending hundreds of microseconds in the timer interrupt
>> > handler due to several L2->L1 exits; under some load in L0 this
>> > could exceed 500 us, so the timer ticks started to pile up and the
>> > guest livelocked.]
>> >
>> > Relieve the situation somewhat by not retrying message delivery for
>> > periodic SynIC timers. This appears to remain within the "lazy"
>> > lost ticks policy for SynIC timers as implemented in KVM.
>> >
>> > Note that it doesn't solve the fundamental problem of livelocking
>> > the guest with a periodic timer whose period is smaller than the
>> > time needed to process a tick, but it makes it a bit less likely to
>> > be triggered.
>> >
>>
>> We seem to be ignoring the 'lazy' setting for stimers completely;
>> going forward, do you think it would make sense to try to implement
>> both lazy and non-lazy modes for stimers as they're described in the
>> TLFS (basically, we need to implement the 'non-lazy' mode)?
>
> The alternative to "lazy" is "catch-up". I guess it's only relevant to
> guests that do timekeeping by tick counting. Hopefully no sane guest
> does that, especially since we're talking about paravirtualized
> interfaces, and guests using them should be aware that tick counting
> can't be relied upon when running under a hypervisor.

OK, so it seems there's no point in complicating the existing code, at
least until there's a compelling use case.

>> (now I'm interested in which mode Windows and Hyper-V actually use).
>>
>> > Signed-off-by: Roman Kagan <rkagan@xxxxxxxxxxxxx>
>> > ---
>> >  arch/x86/kvm/hyperv.c | 13 +++++++++++--
>> >  1 file changed, 11 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> > index 17d04d2c6d4f..3f2b93ad9ccf 100644
>> > --- a/arch/x86/kvm/hyperv.c
>> > +++ b/arch/x86/kvm/hyperv.c
>> > @@ -557,7 +557,7 @@ static int stimer_get_count(struct kvm_vcpu_hv_stimer *stimer, u64 *pcount)
>> >  }
>> >
>> >  static int synic_deliver_msg(struct kvm_vcpu_hv_synic *synic, u32 sint,
>> > -                            struct hv_message *src_msg)
>> > +                            struct hv_message *src_msg, bool no_retry)
>> >  {
>> >         struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
>> >         int msg_off = offsetof(struct hv_message_page, sint_message[sint]);
>> > @@ -584,6 +584,9 @@ static int synic_deliver_msg(struct kvm_vcpu_hv_synic *synic, u32 sint,
>> >                 return r;
>> >
>> >         if (hv_hdr.message_type != HVMSG_NONE) {
>> > +               if (no_retry)
>> > +                       return 0;
>> > +
>> >                 hv_hdr.message_flags.msg_pending = 1;
>> >                 r = kvm_vcpu_write_guest_page(vcpu, msg_page_gfn,
>> >                                               &hv_hdr.message_flags,
>> > @@ -616,11 +619,17 @@ static int stimer_send_msg(struct kvm_vcpu_hv_stimer *stimer)
>> >         struct hv_message *msg = &stimer->msg;
>> >         struct hv_timer_message_payload *payload =
>> >                 (struct hv_timer_message_payload *)&msg->u.payload;
>> > +       /*
>> > +        * don't retry message delivery for periodic ticks, otherwise they may
>> > +        * pile up
>> > +        */
>>
>> Capital 'D' and hard stop at the end would be appreciated :-) Also, it
>> may make sense to mention our 'laziness' here.
>
>         /*
>          * To avoid piling up periodic ticks, don't retry message
>          * delivery for them (within "lazy" lost ticks policy).
>          */
>
> ?

Looks good to me!

-- 
Vitaly
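
The quoted diff cuts off before the hunk that actually computes
no_retry, so for completeness here is a sketch of how the resolved
stimer_send_msg could read once Roman's reworded comment is folded in.
This is a reconstruction from the discussion, not the applied patch:
deriving the flag from the timer's periodic-mode config bit
(HV_STIMER_PERIODIC) and the helper names around it are assumptions
based on the rest of arch/x86/kvm/hyperv.c of that era.

/* Sketch only: reconstructed from the discussion, not the applied patch. */
static int stimer_send_msg(struct kvm_vcpu_hv_stimer *stimer)
{
        struct kvm_vcpu *vcpu = stimer_to_vcpu(stimer);
        struct hv_message *msg = &stimer->msg;
        struct hv_timer_message_payload *payload =
                (struct hv_timer_message_payload *)&msg->u.payload;

        /*
         * To avoid piling up periodic ticks, don't retry message
         * delivery for them (within "lazy" lost ticks policy).
         */
        bool no_retry = stimer->config & HV_STIMER_PERIODIC;

        payload->expiration_time = stimer->exp_time;
        payload->delivery_time = get_time_ref_counter(vcpu->kvm);
        return synic_deliver_msg(vcpu_to_synic(vcpu),
                                 HV_STIMER_SINT(stimer->config),
                                 msg, no_retry);
}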
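
To make the delivery decision itself easy to poke at outside the
kernel, here is a minimal self-contained C model of the busy-slot
branch this patch changes; every name in it is an illustrative
stand-in rather than KVM's, and it builds with any C99 compiler.

#include <stdbool.h>
#include <stdio.h>

#define SLOT_EMPTY   0      /* stands in for HVMSG_NONE */
#define RETRY_LATER  (-1)   /* stands in for -EAGAIN */

struct msg_slot {
        int  message_type;  /* SLOT_EMPTY when the slot is free */
        bool msg_pending;   /* guest should notify us when it frees the slot */
};

static int deliver_msg(struct msg_slot *slot, int type, bool no_retry)
{
        if (slot->message_type != SLOT_EMPTY) {
                if (no_retry)
                        return 0;           /* drop; the next tick will do */
                slot->msg_pending = true;   /* ask the guest for a wakeup */
                return RETRY_LATER;         /* caller keeps the message queued */
        }
        slot->message_type = type;
        return 0;
}

int main(void)
{
        struct msg_slot slot = { .message_type = 42, .msg_pending = false };
        int r;

        /* Periodic tick against a busy slot: dropped, no retry machinery. */
        r = deliver_msg(&slot, 42, true);
        printf("periodic: r=%d pending=%d\n", r, slot.msg_pending);

        /* One-shot message against a busy slot: pends and gets retried. */
        r = deliver_msg(&slot, 42, false);
        printf("one-shot: r=%d pending=%d\n", r, slot.msg_pending);
        return 0;
}

Run, this prints "periodic: r=0 pending=0" and then "one-shot: r=-1
pending=1": the periodic tick is silently dropped, which is safe under
the "lazy" policy because the next period generates an equivalent
message anyway, while the one-shot message arms the pending flag and
tells the caller to keep it queued for retry.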