On Mon, 2023-10-02 at 11:45 -0700, Sean Christopherson wrote:
> On Mon, Oct 02, 2023, David Woodhouse wrote:
> > On Mon, 2023-10-02 at 10:41 -0700, Sean Christopherson wrote:
> > > On Fri, Sep 29, 2023, David Woodhouse wrote:
> > > > On Fri, 2023-09-29 at 08:16 -0700, Sean Christopherson wrote:
> > > > > On Fri, Sep 29, 2023, David Woodhouse wrote:
> > > > > > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > > > > >
> > > > > > Most of the time there's no need to kick the vCPU and deliver the timer
> > > > > > event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
> > > > > > directly from the timer callback, and only fall back to the slow path
> > > > > > when it's necessary to do so.
> > > > >
> > > > > It'd be helpful for non-Xen folks to explain "when it's necessary". IIUC, the
> > > > > only time it's necessary is if the gfn=>pfn cache isn't valid/fresh.
> > > >
> > > > That's an implementation detail.
> > >
> > > And? The target audience of changelogs are almost always people that care about
> > > the implementation.
> > >
> > > > Like all of the fast path functions that can be called from
> > > > kvm_arch_set_irq_inatomic(), it has its own criteria for why it might return
> > > > -EWOULDBLOCK or not. Those are *its* business.
> > >
> > > And all of the KVM code is the business of the people who contribute to the kernel,
> > > now and in the future. Yeah, there's a small chance that a detailed changelog can
> > > become stale if the patch races with some other in-flight change, but even *that*
> > > is a useful data point. E.g. if Paul's patches somehow broke/degraded this code,
> > > then knowing that what the author (you) intended/observed didn't match reality when
> > > the patch was applied would be extremely useful information for whoever encountered
> > > the hypothetical breakage.
> >
> > Fair enough, but on this occasion it truly doesn't matter. It has
> > nothing to do with the implementation of *this* patch. This code makes
> > no assumptions and has no dependency on *when* that fast path might
> > return -EWOULDBLOCK. Sometimes it does, sometimes it doesn't. This code
> > just doesn't care one iota.
> >
> > If this code had *dependencies* on the precise behaviour of
> > kvm_xen_set_evtchn_fast() that we needed to reason about, then sure,
> > I'd have written those explicitly into the commit comment *and* tried
> > to find some way of enforcing them with runtime warnings etc.
> >
> > But it doesn't. So I am no more inclined to document the precise
> > behaviour of kvm_xen_set_evtchn_fast() in a patch which just happens to
> > call it, than I am inclined to document hrtimer_cancel() or any other
> > function called from the new code :)
>
> Just because some bit of code doesn't care/differentiate doesn't mean the behavior
> of said code is correct. I agree that adding a comment to explain the gory details
> is unnecessary and would lead to stale code. But changelogs essentially capture a
> single point in a time, and a big role of the changelog is to help reviewers and
> readers understand (a) the *intent* of the change and (b) whether or not that change
> is correct.
>
> E.g. there's an assumption that -EWOULDBLOCK is the only non-zero return code where
> the correct response is to go down the slow path.
>
> I'm not asking to spell out every single condition, I'm just asking for clarification
> on what the intended behavior is, e.g.
>
>   Use kvm_xen_set_evtchn_fast() directly from the timer callback, and fall
>   back to the slow path if the event is valid but fast delivery isn't
>   possible, which currently can only happen if delivery needs to block,
>   e.g. because the gfn=>pfn cache is invalid or stale.
>
> instead of simply saying "when it's necessary to do so" and leaving it up to the
> reader to figure what _they_ think that means, which might not always align with
> what the author actually meant.

Fair enough.
There's certainly scope for something along the lines of

+	rc = kvm_xen_set_evtchn_fast(&e, vcpu->kvm);
+	if (rc != -EWOULDBLOCK) {

	/*
	 * If kvm_xen_set_evtchn_fast() returned -EWOULDBLOCK, then set the
	 * timer_pending flag and kick the vCPU, to defer delivery of the
	 * event channel to a context which can sleep. If it fails for any
	 * other reasons, just let it fail silently. The slow path fails
	 * silently too; a warning in that case may be guest triggerable,
	 * should never happen anyway, and guests are generally going to
	 * *notice* timers going missing.
	 */

+		vcpu->arch.xen.timer_expires = 0;
+		return HRTIMER_NORESTART;
+	}

That's documenting *this* code, not the function it happens to call.
It's more verbose than I would normally have bothered to be, but I'm
all for improving the level of commenting in our code as long as it's
adding value.
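
For readers without the surrounding code to hand, the control flow described in that comment can be modelled in standalone userspace C roughly as below. This is only a sketch under stated assumptions, not the kernel code: struct timer_ctx, try_fast_delivery(), timer_callback() and TIMER_NORESTART are hypothetical stand-ins invented for illustration, and the "kick" is reduced to setting a flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hrtimer callback return value. */
enum timer_restart { TIMER_NORESTART };

/* Hypothetical stand-in for the per-vCPU timer state. */
struct timer_ctx {
	int fast_rc;           /* what the modelled fast path will return */
	unsigned long expires; /* cleared once the event needs no more work */
	bool pending;          /* set when delivery is deferred */
	bool kicked;           /* set when the vCPU would be kicked */
};

/* Models a fast path that delivers, fails, or would have to sleep. */
static int try_fast_delivery(const struct timer_ctx *ctx)
{
	return ctx->fast_rc;
}

static enum timer_restart timer_callback(struct timer_ctx *ctx)
{
	int rc = try_fast_delivery(ctx);

	if (rc != -EWOULDBLOCK) {
		/*
		 * Either the event was delivered, or it failed for some
		 * other reason and is allowed to fail silently. Nothing
		 * is deferred in either case.
		 */
		ctx->expires = 0;
		return TIMER_NORESTART;
	}

	/*
	 * The fast path would have had to block: mark the timer as
	 * pending and kick the vCPU so that delivery happens later
	 * from a context which can sleep.
	 */
	ctx->pending = true;
	ctx->kicked = true;
	return TIMER_NORESTART;
}
```

The point of the sketch is that only -EWOULDBLOCK leads to the deferred path; success and every other error take the same early return, which is the assumption the changelog discussion is about.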