On Tue, Nov 28, 2017 at 09:42:41AM +0100, Paolo Bonzini wrote:
> On 28/11/2017 01:15, Jan H. Schönherr wrote:
> > On 11/27/2017 12:46 PM, Paolo Bonzini wrote:
> >> On 25/11/2017 14:09, Jan H. Schönherr wrote:
> >>> KVM no longer intercepts MWAIT instructions since Linux 4.12 commit 668fffa3f838
> >>> ("kvm: better MWAIT emulation for guests") for improved latency in some
> >>> workloads.
> >>>
> >>> This series takes the idea further and makes the interception of HLT (patch 2)
> >>> and PAUSE (patch 3) optional as well for similar reasons. Both interceptions
> >>> can be turned off for an individual VM by enabling the corresponding capability.
> >>>
> >>> It also converts KVM_CAP_X86_GUEST_MWAIT into an initially disabled capability
> >>> that has to be enabled explicitly for a VM. This restores pre Linux 4.12
> >>> behavior in the default case, so that a guest cannot put CPUs into low power
> >>> states, which may exceed the host's latency requirements (patch 1).
> >>
> >> Nice! Regarding the userspace ABI, we could have a single capability
> >> KVM_CAP_X86_DISABLE_EXITS that is just KVM_CAP_X86_GUEST_MWAIT renamed.
> >> The value returned by KVM_CHECK_EXTENSION would be defined like this:
> >>
> >> - if bit 16 is 0, bit 0..15 says which exits are disabled
> >>
> >> - if bit 16 is 1, no exits are disabled by default but the capability
> >>   supports KVM_ENABLE_CAP
> >
> > ... and bits 0..15 indicate which exits can be disabled by specifying
> > them as argument to KVM_ENABLE_CAP?
>
> Yes.
>
> >> With
> >>
> >> #define KVM_X86_DISABLE_EXITS_MWAIT 1
> >> #define KVM_X86_DISABLE_EXITS_HLT 2
> >> #define KVM_X86_DISABLE_EXITS_PAUSE 4
> >> #define KVM_X86_DISABLE_EXITS_WITH_ENABLE 0x10000
> >
> > Is that bit 16 an attempt at backwards compatibility with the current state?
> > That would only work, if any potential user actually checks with "==1" instead
> > of "!=0".
>
> Fair enough.
> Let's get rid of KVM_CAP_X86_GUEST_MWAIT altogether and
> add a new capability without bit 16. But let's use just one.
>
> > What is the benefit of doing this with bitmasks as opposed to separate
> > capabilities?
>
> The three capabilities are more or less all doing the same thing.
> Perhaps it would make some sense to only leave PAUSE spin loops in
> guest, but not HLT/MWAIT; but apart from that I think users would
> probably enable all of them.

I am not sure I agree. I think guests still want some way to halt when
giving up the CPU for a long time. If you are not worried about guests
entering low power states, then you only need MWAIT and maybe PAUSE.
HLT within the guest only makes sense if you do not want to allow the
guest to enter a low power state. If you don't exit on any of these, you
want some other way to actually halt the VCPU. Maybe add an IO register
for that.

> So I think we should put in the
> documentation that blindly passing the KVM_CHECK_EXTENSION result to
> KVM_ENABLE_CAP is a valid thing to do when vCPUs are associated to
> dedicated physical CPUs.
>
> Paolo
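
To make the difference between "pass everything through" and "pick a
subset" concrete, here is a rough sketch of how userspace might compute
the mask. The bit values are the ones proposed earlier in this thread and
may not match whatever ABI is finally merged; disable_exits_mask() is a
hypothetical helper, not an existing KVM or QEMU function. It assumes the
variant without bit 16, where a positive KVM_CHECK_EXTENSION result is
simply the set of exits that can be disabled.

```c
#include <stdint.h>

/* Bit values as proposed in this thread; the final ABI may differ. */
#define KVM_X86_DISABLE_EXITS_MWAIT 1
#define KVM_X86_DISABLE_EXITS_HLT   2
#define KVM_X86_DISABLE_EXITS_PAUSE 4

/*
 * Hypothetical helper: given the KVM_CHECK_EXTENSION result for the
 * single capability (0 or negative means unsupported or error) and the
 * exits userspace would like to disable, return the mask that would be
 * passed as the argument to KVM_ENABLE_CAP.
 */
static uint64_t disable_exits_mask(int check_result, uint64_t wanted)
{
	if (check_result <= 0)
		return 0;

	/* Only request exits the kernel reports as supported. */
	return (uint64_t)check_result & wanted;
}
```

With vCPUs on dedicated physical CPUs, wanted would be all ones, which
amounts to blindly passing the KVM_CHECK_EXTENSION result through. For
the case I describe above, wanted would be
KVM_X86_DISABLE_EXITS_MWAIT | KVM_X86_DISABLE_EXITS_PAUSE, so that HLT
still exits and the guest has a way to give up the CPU for a long time.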