Re: [RFC v3 2/5] KVM: x86: add support for ioregionfd signal handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




在 2021/3/17 下午10:19, Elena Afanasova 写道:
On Tue, 2021-03-09 at 13:51 +0800, Jason Wang wrote:
On 2021/2/21 8:04 下午, Elena Afanasova wrote:
The vCPU thread may receive a signal during ioregionfd
communication,
ioctl(KVM_RUN) needs to return to userspace and then ioctl(KVM_RUN)
must resume ioregionfd.
After a glance at the patch, I wonder can we split the patch into
two?

1) sleepable iodevice which is not supported currently, probably with
a
new cap?
2) ioregionfd specific codes (I wonder if it has any)

Then the sleepable iodevice could be reused by future features.

Do you have an idea of another possible use cases?


I don't but iodevice is a genreal infrastrucutre, having a dedicated patch for that helps lot e.g for reviewing.


Could you please
describe your idea in more details?


So it's something like:

patch 1: to introduce the sleepable iodevce infrastructure
patch 2: add ioregionfd on top



Signed-off-by: Elena Afanasova <eafanasova@xxxxxxxxx>
---
v3:
   - add FAST_MMIO bus support
   - move ioregion_interrupted flag to ioregion_ctx
   - reorder ioregion_ctx fields
   - rework complete_ioregion operations
   - add signal handling support for crossing a page boundary case
   - fix kvm_arch_vcpu_ioctl_run() should return -EINTR in case
ioregionfd
     is interrupted

   arch/x86/kvm/vmx/vmx.c   |  40 +++++-
   arch/x86/kvm/x86.c       | 272
+++++++++++++++++++++++++++++++++++++--
   include/linux/kvm_host.h |  10 ++
   virt/kvm/kvm_main.c      |  16 ++-
   4 files changed, 317 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 47b8357b9751..39db31afd27e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5357,19 +5357,51 @@ static int handle_ept_violation(struct
kvm_vcpu *vcpu)
   	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
   }
+#ifdef CONFIG_KVM_IOREGION
+static int complete_ioregion_fast_mmio(struct kvm_vcpu *vcpu)
+{
+	int ret, idx;
+
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
+	ret = kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS,
+			       vcpu->ioregion_ctx.addr, 0, NULL);
+	if (ret) {
+		ret = kvm_mmu_page_fault(vcpu, vcpu->ioregion_ctx.addr,
+					 PFERR_RSVD_MASK, NULL, 0);
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
+		return ret;
+	}
+
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+	return kvm_skip_emulated_instruction(vcpu);
+}
+#endif
+
   static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
   {
   	gpa_t gpa;
+	int ret;
/*
   	 * A nested guest cannot optimize MMIO vmexits, because we have
an
   	 * nGPA here instead of the required GPA.
   	 */
   	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
-	if (!is_guest_mode(vcpu) &&
-	    !kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
-		trace_kvm_fast_mmio(gpa);
-		return kvm_skip_emulated_instruction(vcpu);
+	if (!is_guest_mode(vcpu)) {
+		ret = kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0,
NULL);
+		if (!ret) {
+			trace_kvm_fast_mmio(gpa);
+			return kvm_skip_emulated_instruction(vcpu);
+		}
+
+#ifdef CONFIG_KVM_IOREGION
+		if (unlikely(vcpu->ioregion_ctx.is_interrupted && ret
== -EINTR)) {
So the question still, EINTR looks wrong which means the syscall
can't
be restarted. Not that the syscal doesn't mean KVM_RUN but actually
the
kernel_read|write() you want to do with the ioregion fd.

Also do we need to treat differently for EINTR and ERESTARTSYS since
EINTR means the kernel_read()|write() can't be resumed.

Thanks

I don’t mind replacing EINTR with ERESTARTSYS. I think in this case
there is no more need to process EINTR for ioregionfd. Also it seems
that the QEMU code doesn’t support ERESTARTSYS handling.


So ERESTARTSYS is something that should not be seen by userspace. When SA_RESTART is not set for the signal, kernel will return -EINTR.

My understanding is that you want to implement a mandated restsart the read()/write() syscall which should be fine. The problem is not all the read()/write() can be restarted (the read/write that doesn't return -ERESTARTSYS). In that case we need fail the VCPU_RUN probably.


Can something
like (run_ret == -EINTR || run_ret == -EAGAIN || run_ret ==
-ERESTARTSYS) in kvm_cpu_exec help in this case?


I can't git grep kvm_cpu_exec in the source.

Thanks



Thank you






[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux