Hi Sean, On Wed, Aug 18, 2021 at 10:01 AM Cannon Matthews <cannonmatthews@xxxxxxxxxx> wrote: > > +1 to the rephrasing of the commit message. > > On Wed, Aug 18, 2021 at 11:09 AM Jing Zhang <jingzhangos@xxxxxxxxxx> wrote: > > > > Hi Sean, > > > > On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > > > On Tue, Aug 17, 2021, Jing Zhang wrote: > > > > Current guest/host/halt stats don't show when we are currently halting > > > > > > s/we are/KVM is > > > > > > And I would probably reword it to "when KVM is blocking a vCPU in response to > > > the vCPU activity state, e.g. halt". More on that below. > > > > > > > well. If a guest halts for a long period of time they could appear > > > > pathologically blocked but really it's the opposite there's nothing to > > > > do. > > > > Simply count the number of times we enter and leave the kvm_vcpu_block > > > > > > s/we/KVM > > > > > > In general, it's good practice to avoid pronouns in comments and changelogs as > > > doing so all but forces using precise, unambiguous language. Things like 'it' > > > and 'they' are ok when it's abundantly clear what they refer to, but 'we' and 'us' > > > are best avoided entirely. > > > > > > > function per vcpu, if they are unequal, then a VCPU is currently > > > > halting. > > > > The existing stats like halt_exits and halt_wakeups don't quite capture > > > > this. The time spend halted and halt polling is reported eventually, but > > > > not until we wakeup and resume. If a guest were to indefinitely halt one > > > > of it's CPUs we would never know, it may simply appear blocked. > > > ^^^^ ^^ > > > its userspace? > > > > > > > > > The "blocked" terminology is a bit confusing since KVM is explicitly blocking > > > the vCPU, it just happens to mostly do so in response to a guest HLT. I think > > > "block" is intended to mean "vCPU task not run", but it would be helpful to make > > > that clear. > > > > > That's a good point. Will reword the comments as you suggested. > > > > Original-by: Cannon Matthews <cannonmatthews@xxxxxxxxxx> > > > > Signed-off-by: Jing Zhang <jingzhangos@xxxxxxxxxx> > > > > --- > > > > include/linux/kvm_host.h | 4 +++- > > > > include/linux/kvm_types.h | 2 ++ > > > > virt/kvm/kvm_main.c | 2 ++ > > > > 3 files changed, 7 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > > > > index d447b21cdd73..23d2e19af3ce 100644 > > > > --- a/include/linux/kvm_host.h > > > > +++ b/include/linux/kvm_host.h > > > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc { > > > > STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist, \ > > > > HALT_POLL_HIST_COUNT), \ > > > > STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist, \ > > > > - HALT_POLL_HIST_COUNT) > > > > + HALT_POLL_HIST_COUNT), \ > > > > + STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts), \ > > > > + STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends) > > > > > > Why two counters? It's per-vCPU, can't this just be a "blocked" flag or so? I > > > get that all the other stats use "halt", but that's technically wrong as KVM will > > > block vCPUs that are not runnable for other reason, e.g. because they're in WFS > > > on x86. > > The point is to separate "blocked because not runable" and "guest > explicitly asked to do nothing" > > IIRC I originally wrote this patch to help discriminate how we spent > VCPU thread time, > in particular into two categories of essentially "doing useful work > on behalf of the guest" and > "blocked from doing useful work." > > Since a guest has explictly asked for a vcpu to HLT, this is "useful > work on behalf of the guest" > even though the thread is "blocked" from running. > > This allows answering questions like, are we spending too much time > waiting on mutexes, or > long running kernel routines rather than running the vcpu in guest > mode, or did the guest explictly > tell us to not doing anything. > > So I would suggest keeping the "halt" part of the counters' name, and > remove the "blocked" part > rather than the other way around. We explicitly do not want to include > non-halt blockages in this. > > That being said I suppose a boolean could work as well. I think we did > this because it worked with > and mirrored existing stats rather than anything particularly nuanced. > Though there might be some > eventual consistency sort of concerns with how these stats are updated > and exported that could make > monotonic increasing counters more useful. Any more comments on the naming and the increasing counters? > > > The two counters are used to determine the reason why vCPU is not > > running. If the halt_block_ends is one less than halt_block_starts, > > then we know the vCPU is explicitly blocked by KVM. Otherwise, we know > > there might be something wrong with the vCPU. Does this make sense? > > Will rename from "halt_block_*" to "vcpu_block_*". > > > > Thanks, > > Jing