Apologies for the slow response. On Wed, Apr 21, 2021, Frederic Weisbecker wrote: > On Tue, Apr 20, 2021 at 11:26:34PM +0000, Sean Christopherson wrote: > > On Wed, Apr 21, 2021, Frederic Weisbecker wrote: > > > On Thu, Apr 15, 2021 at 03:21:00PM -0700, Sean Christopherson wrote: > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > index 16fb39503296..e4d475df1d4a 100644 > > > > --- a/arch/x86/kvm/x86.c > > > > +++ b/arch/x86/kvm/x86.c > > > > @@ -9230,6 +9230,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) > > > > local_irq_disable(); > > > > kvm_after_interrupt(vcpu); > > > > > > > > + /* > > > > + * When using tick-based accounting, wait until after servicing IRQs to > > > > + * account guest time so that any ticks that occurred while running the > > > > + * guest are properly accounted to the guest. > > > > + */ > > > > + if (!vtime_accounting_enabled_this_cpu()) > > > > + vtime_account_guest_exit(); > > > > > > Can we rather have instead: > > > > > > static inline void tick_account_guest_exit(void) > > > { > > > if (!vtime_accounting_enabled_this_cpu()) > > > current->flags &= ~PF_VCPU; > > > } > > > > > > It duplicates a bit of code but I think this will read less confusing. > > > > Either way works for me. I used vtime_account_guest_exit() to try to keep as > > many details as possible inside vtime, e.g. in case the implemenation is tweaked > > in the future. But I agree that pretending KVM isn't already deeply intertwined > > with the details is a lie. > > Ah I see, before 87fa7f3e98a131 the vtime was accounted after interrupts get > processed. So it used to work until then. I see that ARM64 waits for IRQs to > be enabled too. > > PPC/book3s_hv, MIPS, s390 do it before IRQs get re-enabled (weird, how does that > work?) No idea. It's entirely possible it doesn't work on one or more of those architectures. Based on init/Kconfig, s390 doesn't support tick-based accounting, so I assume s390 is ok. config TICK_CPU_ACCOUNTING bool "Simple tick based cputime accounting" depends on !S390 && !NO_HZ_FULL > And PPC/book3s_pr calls guest_exit() so I guess it has interrupts enabled. > > The point is: does it matter to call vtime_account_guest_exit() before or > after interrupts? If it doesn't matter, we can simply call > vtime_account_guest_exit() once and for all once IRQs are re-enabled. > > If it does matter because we don't want to account the host IRQs firing at the > end of vcpu exit, then probably we should standardize that behaviour and have > guest_exit_vtime() called before interrupts get enabled and guest_exit_tick() > called after interrupts get enabled. It's probably then beyond the scope of this > patchset but I would like to poke your opinion on that. > > Thanks. I don't know. For x86, I would be ok with simply moving the call to vtime_account_guest_exit() to after IRQs are enabled. It would bug me a little bit that KVM _could_ be more precise when running with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y, and KVM would still be poking into the details of vtime_account_guest_exit() to some extent, but overall it would be an improvement from a code cleanliness perspective. The problem is I have no clue who, if anyone, deploys KVM on x86 with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y. On the other hand, AMD/SVM has always had the "inaccurate" accounting, and Intel/VMX has been inaccurate since commit d7a08882a0a4 ("KVM: x86: Unconditionally enable irqs in guest context"), which amusingly was a fix for an edge case in tick-based accounting. Anyone have an opinion either way? I'm very tempted to go with Frederic's suggestion of moving the time accounting back to where it was, it makes KVM just a little less ugly.