On 2021/2/21 8:04 下午, Elena Afanasova wrote:
The vCPU thread may receive a signal during ioregionfd communication,
ioctl(KVM_RUN) needs to return to userspace and then ioctl(KVM_RUN)
must resume ioregionfd.
After a glance at the patch, I wonder can we split the patch into two?
1) sleepable iodevice which is not supported currently, probably with a
new cap?
2) ioregionfd specific codes (I wonder if it has any)
Then the sleepable iodevice could be reused by future features.
Signed-off-by: Elena Afanasova <eafanasova@xxxxxxxxx>
---
v3:
- add FAST_MMIO bus support
- move ioregion_interrupted flag to ioregion_ctx
- reorder ioregion_ctx fields
- rework complete_ioregion operations
- add signal handling support for crossing a page boundary case
- fix kvm_arch_vcpu_ioctl_run() should return -EINTR in case ioregionfd
is interrupted
arch/x86/kvm/vmx/vmx.c | 40 +++++-
arch/x86/kvm/x86.c | 272 +++++++++++++++++++++++++++++++++++++--
include/linux/kvm_host.h | 10 ++
virt/kvm/kvm_main.c | 16 ++-
4 files changed, 317 insertions(+), 21 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 47b8357b9751..39db31afd27e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5357,19 +5357,51 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
}
+#ifdef CONFIG_KVM_IOREGION
+static int complete_ioregion_fast_mmio(struct kvm_vcpu *vcpu)
+{
+ int ret, idx;
+
+ idx = srcu_read_lock(&vcpu->kvm->srcu);
+ ret = kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS,
+ vcpu->ioregion_ctx.addr, 0, NULL);
+ if (ret) {
+ ret = kvm_mmu_page_fault(vcpu, vcpu->ioregion_ctx.addr,
+ PFERR_RSVD_MASK, NULL, 0);
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
+ return ret;
+ }
+
+ srcu_read_unlock(&vcpu->kvm->srcu, idx);
+ return kvm_skip_emulated_instruction(vcpu);
+}
+#endif
+
static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
{
gpa_t gpa;
+ int ret;
/*
* A nested guest cannot optimize MMIO vmexits, because we have an
* nGPA here instead of the required GPA.
*/
gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
- if (!is_guest_mode(vcpu) &&
- !kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
- trace_kvm_fast_mmio(gpa);
- return kvm_skip_emulated_instruction(vcpu);
+ if (!is_guest_mode(vcpu)) {
+ ret = kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL);
+ if (!ret) {
+ trace_kvm_fast_mmio(gpa);
+ return kvm_skip_emulated_instruction(vcpu);
+ }
+
+#ifdef CONFIG_KVM_IOREGION
+ if (unlikely(vcpu->ioregion_ctx.is_interrupted && ret == -EINTR)) {
So the question still, EINTR looks wrong which means the syscall can't
be restarted. Not that the syscal doesn't mean KVM_RUN but actually the
kernel_read|write() you want to do with the ioregion fd.
Also do we need to treat differently for EINTR and ERESTARTSYS since
EINTR means the kernel_read()|write() can't be resumed.
Thanks