[PATCH RFC 4/6] KVM: x86: acknowledgment mechanism for async pf page ready notifications

Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> · Wed, 29 Apr 2020 11:36:32 +0200

If two page ready notifications happen back to back, the second one is not
delivered right away: the only mechanism we currently have to deliver it is
the kvm_check_async_pf_completion() check in the vcpu_run() loop. That check
is only performed on the next vmexit, which in some cases may take a while.
With interrupt based page ready notification delivery the situation is even
worse: unlike exceptions, interrupts are not handled immediately, so we must
check whether the slot is empty before delivering the next notification.
This is slow and unnecessary. Introduce a dedicated MSR_KVM_ASYNC_PF_ACK MSR
which the guest writes to communicate that the slot is free and the host
should check its notification queue. Mandate using it for interrupt based
type 2 APF event delivery.

Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
 Documentation/virt/kvm/msr.rst       | 16 +++++++++++++++-
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kvm/x86.c                   |  9 ++++++++-
 3 files changed, 24 insertions(+), 2 deletions(-)
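
Note for reviewers: the intended handshake can be modelled in plain
userspace C. This is only an illustrative sketch, not code from this
series. 'struct host_model', 'struct apf_slot_model', model_wrmsr(),
host_page_ready(), host_check_completion(), guest_handle_page_ready() and
APF_QUEUE_LEN are hypothetical names made up for the model; only the MSR
index and the "write 1 to bit 0 after clearing 'reason'" rule come from
the patch. The value 2 for 'reason' is assumed to stand for "page ready".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_KVM_ASYNC_PF_ACK 0x4b564d07u /* index introduced by this patch */
#define APF_QUEUE_LEN 8                  /* hypothetical, for the model only */

/* Simplified stand-in for 'struct kvm_vcpu_pv_apf_data'. */
struct apf_slot_model {
	uint32_t reason; /* 0 means the slot is free */
	uint32_t token;
};

/* Hypothetical host model: one shared slot plus a FIFO of ready tokens. */
struct host_model {
	struct apf_slot_model slot;
	uint32_t queue[APF_QUEUE_LEN];
	size_t head, tail;
};

/* Host: deliver the next queued notification if the slot is free. */
static int host_check_completion(struct host_model *h)
{
	if (h->slot.reason != 0 || h->head == h->tail)
		return 0;
	h->slot.token = h->queue[h->head++ % APF_QUEUE_LEN];
	h->slot.reason = 2; /* assumed: type 2, page ready */
	return 1;
}

/* Host: a page became ready; queue it and try to deliver right away. */
static void host_page_ready(struct host_model *h, uint32_t token)
{
	assert(h->tail - h->head < APF_QUEUE_LEN);
	h->queue[h->tail++ % APF_QUEUE_LEN] = token;
	host_check_completion(h);
}

/* Host: mirrors the MSR_KVM_ASYNC_PF_ACK case in kvm_set_msr_common(). */
static void model_wrmsr(struct host_model *h, uint32_t msr, uint64_t data)
{
	if (msr == MSR_KVM_ASYNC_PF_ACK && (data & 0x1))
		host_check_completion(h);
}

/* Guest: consume one notification, clear the slot, then acknowledge. */
static uint32_t guest_handle_page_ready(struct host_model *h)
{
	uint32_t token = h->slot.token;

	h->slot.reason = 0;                      /* free the slot first */
	model_wrmsr(h, MSR_KVM_ASYNC_PF_ACK, 1); /* then tell the host */
	return token;
}
```

In this model, two host_page_ready() calls made back to back leave the
second token queued, and it reaches the slot only when
guest_handle_page_ready() writes the ACK. No vmexit-driven polling is
involved.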

diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index 7433e55f7184..18db3448db06 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -219,6 +219,11 @@ data:
 	If during pagefault APF reason is 0 it means that this is regular
 	page fault.
 
+	For interrupt based delivery, the guest has to write '1' to
+	MSR_KVM_ASYNC_PF_ACK every time it clears 'reason' in the shared
+	'struct kvm_vcpu_pv_apf_data'. This forces KVM to re-scan its
+	queue and deliver the next pending notification.
+
 	During delivery of type 1 APF cr2 contains a token that will
 	be used to notify a guest when missing page becomes
 	available. When page becomes available type 2 APF is sent with
@@ -340,4 +345,13 @@ data:
 
 	To switch to interrupt based delivery of type 2 APF events guests
 	are supposed to enable asynchronous page faults and set bit 3 in
-	MSR_KVM_ASYNC_PF_EN first.
+
+MSR_KVM_ASYNC_PF_ACK:
+	0x4b564d07
+
+data:
+	Asynchronous page fault acknowledgment. When the guest is done
+	processing a type 2 APF event and the 'reason' field in 'struct
+	kvm_vcpu_pv_apf_data' is cleared, it is supposed to write '1' to
+	bit 0 of the MSR. This causes the host to re-scan its queue and
+	check if there are more notifications pending.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 1bbb0b7e062f..5c7449980619 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -51,6 +51,7 @@
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
 #define MSR_KVM_POLL_CONTROL	0x4b564d05
 #define MSR_KVM_ASYNC_PF2	0x4b564d06
+#define MSR_KVM_ASYNC_PF_ACK	0x4b564d07
 
 struct kvm_steal_time {
 	__u64 steal;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 861dce1e7cf5..e3b91ac33bfd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1243,7 +1243,7 @@ static const u32 emulated_msrs_all[] = {
 	HV_X64_MSR_TSC_EMULATION_STATUS,
 
 	MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-	MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF2,
+	MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF2, MSR_KVM_ASYNC_PF_ACK,
 
 	MSR_IA32_TSC_ADJUST,
 	MSR_IA32_TSCDEADLINE,
@@ -2915,6 +2915,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (kvm_pv_enable_async_pf2(vcpu, data))
 			return 1;
 		break;
+	case MSR_KVM_ASYNC_PF_ACK:
+		if (data & 0x1)
+			kvm_check_async_pf_completion(vcpu);
+		break;
 	case MSR_KVM_STEAL_TIME:
 
 		if (unlikely(!sched_info_on()))
@@ -3194,6 +3198,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_KVM_ASYNC_PF2:
 		msr_info->data = vcpu->arch.apf.msr2_val;
 		break;
+	case MSR_KVM_ASYNC_PF_ACK:
+		msr_info->data = 0;
+		break;
 	case MSR_KVM_STEAL_TIME:
 		msr_info->data = vcpu->arch.st.msr_val;
 		break;
-- 
2.25.3