On Tue, Aug 17, 2021, Jing Zhang wrote: > Current guest/host/halt stats don't show when we are currently halting s/we are/KVM is And I would probably reword it to "when KVM is blocking a vCPU in response to the vCPU activity state, e.g. halt". More on that below. > well. If a guest halts for a long period of time they could appear > pathologically blocked but really it's the opposite there's nothing to > do. > Simply count the number of times we enter and leave the kvm_vcpu_block s/we/KVM In general, it's good practice to avoid pronouns in comments and changelogs as doing so all but forces using precise, unambiguous language. Things like 'it' and 'they' are ok when it's abundantly clear what they refer to, but 'we' and 'us' are best avoided entirely. > function per vcpu, if they are unequal, then a VCPU is currently > halting. > The existing stats like halt_exits and halt_wakeups don't quite capture > this. The time spend halted and halt polling is reported eventually, but > not until we wakeup and resume. If a guest were to indefinitely halt one > of it's CPUs we would never know, it may simply appear blocked. ^^^^ ^^ its userspace? The "blocked" terminology is a bit confusing since KVM is explicitly blocking the vCPU, it just happens to mostly do so in response to a guest HLT. I think "block" is intended to mean "vCPU task not run", but it would be helpful to make that clear. > Original-by: Cannon Matthews <cannonmatthews@xxxxxxxxxx> > Signed-off-by: Jing Zhang <jingzhangos@xxxxxxxxxx> > --- > include/linux/kvm_host.h | 4 +++- > include/linux/kvm_types.h | 2 ++ > virt/kvm/kvm_main.c | 2 ++ > 3 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index d447b21cdd73..23d2e19af3ce 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc { > STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist, \ > HALT_POLL_HIST_COUNT), \ > STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist, \ > - HALT_POLL_HIST_COUNT) > + HALT_POLL_HIST_COUNT), \ > + STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts), \ > + STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends) Why two counters? It's per-vCPU, can't this just be a "blocked" flag or so? I get that all the other stats use "halt", but that's technically wrong as KVM will block vCPUs that are not runnable for other reason, e.g. because they're in WFS on x86.