On Mon, Jan 20, 2025, Kenta Ishiguro wrote:
> Thank you for your comments.
> I understood the scenario of why my previous change was unsafe.
>
> Also, I like the concept of KVM_VCPU_IN_HOST for tracking whether a vCPU is
> scheduled out because it can be helpful to understand the vCPU's situation
> from guests.
>
> I have tested the attached changes, but I found that the performance
> improvement was somewhat limited.  The boundary-checking code prevents
> KVM from setting KVM_VCPU_IN_HOST and KVM_VCPU_PREEMPTED, which may be
> contributing to this limitation.  I think this is a conservative
> approach to avoid using stale TLB entries.
> I referred to this patch:
> https://lore.kernel.org/lkml/20220614021116.1101331-1-sashal@xxxxxxxxxx/
> Since PV waiting causes the most significant overhead, is it possible to
> allow guests to perform PV flush if vCPUs are PV waiting and scheduled out?

Hmm, yes?  The "right now kvm_vcpu_check_block() is doing memory accesses"
issue was mostly addressed by commit 26844fee6ade ("KVM: x86: never write to
memory from kvm_vcpu_check_block()").

  It would in principle be okay to report the vCPU as preempted also
  if it is sleeping in kvm_vcpu_block(): a TLB flush IPI will incur the
  vmentry/vmexit overhead unnecessarily, and optimistic spinning is
  also unlikely to succeed.  However, leave it for later because right
  now kvm_vcpu_check_block() is doing memory accesses.  Even though the
  TLB flush issue only applies to virtual memory address, it's very
  much preferrable to be conservative.

I say mostly because there are technically reads to guest memory via cached
objects, specifically vmx_has_nested_events()'s use of the nested Posted
Interrupt Descriptor (PID).  But a vCPU's PID is referenced directly via
physical address in the VMCS, so there are no TLB flushing concerns on that
front.

I think this would be safe/correct.  Key word "think".

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5193c3dfbce1..691cf4cfe5d4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2350,16 +2350,6 @@ static inline bool kvm_irq_is_postable(struct kvm_lapic_irq *irq)
 		irq->delivery_mode == APIC_DM_LOWEST);
 }
 
-static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
-{
-	kvm_x86_call(vcpu_blocking)(vcpu);
-}
-
-static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
-{
-	kvm_x86_call(vcpu_unblocking)(vcpu);
-}
-
 static inline int kvm_cpu_get_apicid(int mps_cpu)
 {
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5f3ad13a8ac7..4f8d0b000e93 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11170,6 +11170,18 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
 }
 
+void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.at_instruction_boundary = true;
+	kvm_x86_call(vcpu_blocking)(vcpu);
+}
+
+void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
+{
+	kvm_x86_call(vcpu_unblocking)(vcpu);
+	vcpu->arch.at_instruction_boundary = false;
+}
+
 /* Called within kvm->srcu read side. */
 static inline int vcpu_block(struct kvm_vcpu *vcpu)
 {
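
Side note, in case it helps map the diff back to Kenta's observation: the
boundary check in question boils down to something like this (my paraphrase
of the sched-out path, not the exact upstream code):

	/*
	 * Sketch only, details elided: KVM refuses to report a vCPU as
	 * preempted unless it was scheduled out on an instruction boundary,
	 * so the guest never skips a flush IPI while an instruction is only
	 * partially emulated.  With the hunks above, a vCPU sleeping in
	 * kvm_vcpu_block() also satisfies this check.
	 */
	if (!vcpu->arch.at_instruction_boundary)
		return;		/* don't set KVM_VCPU_PREEMPTED */

	/* ... mark KVM_VCPU_PREEMPTED in the guest's steal time area ... */

I.e. treating the vCPU as being at an instruction boundary for the entire
duration of the block is what allows the preempted/in-host bits to be set
while it is PV waiting.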