Re: [PATCH v2] KVM: x86: Use fast path for Xen timer delivery

David Woodhouse <dwmw2@xxxxxxxxxxxxx> · Mon, 02 Oct 2023 20:33:41 +0100

On Mon, 2023-10-02 at 11:45 -0700, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, David Woodhouse wrote:
> > On Mon, 2023-10-02 at 10:41 -0700, Sean Christopherson wrote:
> > > On Fri, Sep 29, 2023, David Woodhouse wrote:
> > > > On Fri, 2023-09-29 at 08:16 -0700, Sean Christopherson wrote:
> > > > > On Fri, Sep 29, 2023, David Woodhouse wrote:
> > > > > > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > > > > > 
> > > > > > Most of the time there's no need to kick the vCPU and deliver the timer
> > > > > > event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
> > > > > > directly from the timer callback, and only fall back to the slow path
> > > > > > when it's necessary to do so.
> > > > > 
> > > > > It'd be helpful for non-Xen folks to explain "when it's necessary".  IIUC, the
> > > > > only time it's necessary is if the gfn=>pfn cache isn't valid/fresh.
> > > > 
> > > > That's an implementation detail.
> > > 
> > > And?  The target audience of changelogs are almost always people that care about
> > > the implementation.
> > > 
> > > > Like all of the fast path functions that can be called from
> > > > kvm_arch_set_irq_inatomic(), it has its own criteria for why it might return
> > > > -EWOULDBLOCK or not. Those are *its* business.
> > > 
> > > And all of the KVM code is the business of the people who contribute to the kernel,
> > > now and in the future.  Yeah, there's a small chance that a detailed changelog can
> > > become stale if the patch races with some other in-flight change, but even *that*
> > > is a useful data point.  E.g. if Paul's patches somehow broke/degraded this code,
> > > then knowing that what the author (you) intended/observed didn't match reality when
> > > the patch was applied would be extremely useful information for whoever encountered
> > > the hypothetical breakage.
> > 
> > Fair enough, but on this occasion it truly doesn't matter. It has
> > nothing to do with the implementation of *this* patch. This code makes
> > no assumptions and has no dependency on *when* that fast path might
> > return -EWOULDBLOCK. Sometimes it does, sometimes it doesn't. This code
> > just doesn't care one iota.
> > 
> > If this code had *dependencies* on the precise behaviour of
> > kvm_xen_set_evtchn_fast() that we needed to reason about, then sure,
> > I'd have written those explicitly into the commit comment *and* tried
> > to find some way of enforcing them with runtime warnings etc.
> > 
> > But it doesn't. So I am no more inclined to document the precise
> > behaviour of kvm_xen_set_evtchn_fast() in a patch which just happens to
> > call it, than I am inclined to document hrtimer_cancel() or any other
> > function called from the new code :)
> 
> Just because some bit of code doesn't care/differentiate doesn't mean the behavior
> of said code is correct.  I agree that adding a comment to explain the gory details
> is unnecessary and would lead to stale code.  But changelogs essentially capture a
> single point in time, and a big role of the changelog is to help reviewers and
> readers understand (a) the *intent* of the change and (b) whether or not that change
> is correct.
> 
> E.g. there's an assumption that -EWOULDBLOCK is the only non-zero return code where
> the correct response is to go down the slow path.
> 
> I'm not asking to spell out every single condition, I'm just asking for clarification
> on what the intended behavior is, e.g.
> 
>   Use kvm_xen_set_evtchn_fast() directly from the timer callback, and fall
>   back to the slow path if the event is valid but fast delivery isn't
>   possible, which currently can only happen if delivery needs to block,
>   e.g. because the gfn=>pfn cache is invalid or stale.
> 
> instead of simply saying "when it's necessary to do so" and leaving it up to the
> reader to figure out what _they_ think that means, which might not always align with
> what the author actually meant.

Fair enough. There's certainly scope for something along the lines of

+	rc = kvm_xen_set_evtchn_fast(&e, vcpu->kvm);
+	if (rc != -EWOULDBLOCK) {

   /*
    * If kvm_xen_set_evtchn_fast() returned -EWOULDBLOCK, then set the
    * timer_pending flag and kick the vCPU, to defer delivery of the 
    * event channel to a context which can sleep. If it fails for any
    * other reason, just let it fail silently. The slow path fails
    * silently too; a warning in that case may be guest triggerable,
    * should never happen anyway, and guests are generally going to
    * *notice* timers going missing.
    */

+		vcpu->arch.xen.timer_expires = 0;
+		return HRTIMER_NORESTART;
+	}

That's documenting *this* code, not the function it happens to call.
It's more verbose than I would normally have bothered to be, but I'm
all for improving the level of commenting in our code as long as it's
adding value. 
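
For readers who want to see the control flow that comment describes in
isolation, here is a minimal userspace sketch. It is not the kernel code:
struct mock_vcpu, mock_set_evtchn_fast(), mock_timer_callback() and the
TIMER_* values are hypothetical stand-ins, and the "kick" is reduced to
setting a flag. It only assumes what is stated above, namely that
-EWOULDBLOCK means "defer to a context which can sleep" and that any other
return value, success or failure, ends delivery in the callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hrtimer callback return value. */
enum timer_restart { TIMER_NORESTART, TIMER_RESTART };

/* Hypothetical stand-in for the vCPU state; not the real KVM structure. */
struct mock_vcpu {
	int fast_rc;                      /* value the mocked fast path returns */
	bool timer_pending;               /* set when delivery is deferred */
	bool kicked;                      /* set when the vCPU would be kicked */
	unsigned long long timer_expires; /* non-zero while a timer is armed */
};

/* Mocked fast path: returns 0, -EWOULDBLOCK, or some other -errno. */
static int mock_set_evtchn_fast(struct mock_vcpu *v)
{
	return v->fast_rc;
}

/* Sketch of the timer callback: fast path first, slow path on demand. */
static enum timer_restart mock_timer_callback(struct mock_vcpu *v)
{
	int rc;

	assert(v != NULL);

	rc = mock_set_evtchn_fast(v);
	if (rc != -EWOULDBLOCK) {
		/*
		 * Either the event was delivered, or it failed for a reason
		 * that deferring would not fix. In both cases we are done,
		 * and a failure is deliberately silent.
		 */
		v->timer_expires = 0;
		return TIMER_NORESTART;
	}

	/* Fast delivery would block: defer to a context which can sleep. */
	v->timer_pending = true;
	v->kicked = true;
	return TIMER_NORESTART;
}
```

Only the -EWOULDBLOCK case sets timer_pending and kicks. With fast_rc set to
0 or to, say, -EINVAL, the sketch clears timer_expires and touches nothing
else, which is the "fail silently" behaviour described in the comment.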
