On Wed, Aug 18, 2021, Cannon Matthews wrote: > Since a guest has explictly asked for a vcpu to HLT, this is "useful work on > behalf of the guest" even though the thread is "blocked" from running. > > This allows answering questions like, are we spending too much time waiting > on mutexes, or long running kernel routines rather than running the vcpu in > guest mode, or did the guest explictly tell us to not doing anything. > > So I would suggest keeping the "halt" part of the counters' name, and remove > the "blocked" part rather than the other way around. We explicitly do not > want to include non-halt blockages in this. But this patch does include non-halt blockages, which is why I brought up the technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call this for halt-like behavior, but PPC looks like it has at least one path that's not halt-like. I doubt anyone actually cares if the stat is a misnomer in some cases, but at the same time I think there's opportunity for clean up here. E.g. halt polling if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete waste of cycles if the vCPU doesn't actually block. And kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into s390, which can add its own wrapper if necessary. So with a bit of massaging and a slight change in tracing behavior, I believe we can isolate the actual wait/halt and avoid "halted" being technically-wrong, and fix some inefficiencies at the same time. Jing, can you do a v2 of this patch and send it to me off-list? With luck, my idea will work and I can fold your patch in, and if not we can always post v2 standalone in a few weeks. E.g. I'm thinking something like... void kvm_vcpu_halt(struct kvm_vcpu *vcpu) { vcpu->stat.generic.halted = 1; if (<halt polling failed>) kvm_vcpu_block(); vcpu->stat.generic.halted = 0; <update halt polling stuff> } void kvm_vcpu_block(struct kvm_vcpu *vcpu) { bool waited = false; ktime_t start, cur; u64 block_ns; start = ktime_get(); prepare_to_rcuwait(&vcpu->wait); for (;;) { set_current_state(TASK_INTERRUPTIBLE); if (kvm_vcpu_check_block(vcpu) < 0) break; waited = true; schedule(); } finish_rcuwait(&vcpu->wait); block_ns = ktime_to_ns(cur) - ktime_to_ns(start); trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu)); }