Re: [PATCH v19 110/130] KVM: TDX: Handle TDX PV MMIO hypercall

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 7/17/2024 6:25 AM, Isaku Yamahata wrote:
On Tue, Jun 25, 2024 at 02:54:09PM +0800,
Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> wrote:


On 2/26/2024 4:26 PM, isaku.yamahata@xxxxxxxxx wrote:
From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>

Export kvm_io_bus_read and kvm_mmio tracepoint and wire up TDX PV MMIO
hypercall to the KVM backend functions.

kvm_io_bus_read/write() searches KVM device emulated in kernel of the given
MMIO address and emulates the MMIO.  As TDX PV MMIO also needs it, export
kvm_io_bus_read().  kvm_io_bus_write() is already exported.  TDX PV MMIO
emulates some of MMIO itself.  To add trace point consistently with x86
kvm, export kvm_mmio tracepoint.

Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
---
   arch/x86/kvm/vmx/tdx.c | 114 +++++++++++++++++++++++++++++++++++++++++
   arch/x86/kvm/x86.c     |   1 +
   virt/kvm/kvm_main.c    |   2 +
   3 files changed, 117 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 55fc6cc6c816..389bb95d2af0 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1217,6 +1217,118 @@ static int tdx_emulate_io(struct kvm_vcpu *vcpu)
   	return ret;
   }
+static int tdx_complete_mmio(struct kvm_vcpu *vcpu)
+{
+	unsigned long val = 0;
+	gpa_t gpa;
+	int size;
+
+	KVM_BUG_ON(vcpu->mmio_needed != 1, vcpu->kvm);
+	vcpu->mmio_needed = 0;
mmio_needed is used by instruction emulator to setup the complete callback.
Since TDX handle MMIO in a PV way, mmio_needed is not needed here.
Ok, we don't need to update mmio_needed.


+
+	if (!vcpu->mmio_is_write) {
It's also needed by instruction emulator, we can use
vcpu->run->mmio.is_write instead.
No because vcpu->run->mmio is shared with user space.  KVM need to stash
it independently.


+		gpa = vcpu->mmio_fragments[0].gpa;
+		size = vcpu->mmio_fragments[0].len;
Since MMIO cross page boundary is not allowed according to the input checks
from TDVMCALL, these mmio_fragments[] is not needed.
Just use vcpu->run->mmio.phys_addr and vcpu->run->mmio.len?
ditto.


+
+		memcpy(&val, vcpu->run->mmio.data, size);
+		tdvmcall_set_return_val(vcpu, val);
+		trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val);
+	}
Tracepoint for KVM_TRACE_MMIO_WRITE is missing when it is handled in
userspace.
tdx_mmio_write() has it before existing to the user space.  It matches with
how write_mmio() behaves in x86.c.

Hmm, to match with other code, we should remove
trace_kvm_mmio(KVM_TRACE_MMIO_READ) and keep KVM_TRACE_MMIO_READ_UNSATISFIED
in tdx_emulate_mmio().  That's how read_prepare() and read_exit_mmio() behaves.

For MMIO read
- When kernel can handle the MMIO, KVM_TRACE_MMIO_READ with data.
- When exiting to the user space, KVM_TRACE_MMIO_READ_UNSATISFIED before
   the exit.  No trace after the user space handled the MMIO.

For MMIO read, in the emulator, there is still a trace after the userspace handled the MMIO.
In complete_emulated_mmio(), if all fragments have been handled, it will
set vcpu->mmio_read_completed to 1 and call complete_emulated_io().
complete_emulated_io
    kvm_emulate_instruction(vcpu, EMULTYPE_NO_DECODE)
        x86_emulate_instruction
            x86_emulate_insn
                emulator_read_write
                    read_prepare
                        At this point, vcpu->mmio_read_completed is 1,
                        it traces KVM_TRACE_MMIO_READ with data
                        and then clear vcpu->mmio_read_completed

So to align with emulator, we should keep the trace for KVM_TRACE_MMIO_READ.



For MMIO write
- KVM_TRACE_MMIO_WRITE before handling it.


Also, the return code is only set when the emulation is done in kernel, but
not set when it's handled in userspace.

+	return 1;
+}
How about the fixup as following:

@@ -1173,19 +1173,18 @@ static int tdx_emulate_io(struct kvm_vcpu *vcpu)
static int tdx_complete_mmio(struct kvm_vcpu *vcpu) { unsigned long val = 0;
- gpa_t gpa; - int size; - - vcpu->mmio_needed = 0; - - if
(!vcpu->mmio_is_write) { - gpa = vcpu->mmio_fragments[0].gpa; - size =
vcpu->mmio_fragments[0].len; + gpa_t gpa = vcpu->run->mmio.phys_addr; + int
size = vcpu->run->mmio.len; + if (vcpu->run->mmio.is_write) { +
trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, size, gpa, &val); + } else {
memcpy(&val, vcpu->run->mmio.data, size); tdvmcall_set_return_val(vcpu,
val); trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val); } + +
tdvmcall_set_return_code(vcpu, TDVMCALL_SUCCESS); return 1; }



+
+static inline int tdx_mmio_write(struct kvm_vcpu *vcpu, gpa_t gpa, int size,
+				 unsigned long val)
+{
+	if (kvm_iodevice_write(vcpu, &vcpu->arch.apic->dev, gpa, size, &val) &&
+	    kvm_io_bus_write(vcpu, KVM_MMIO_BUS, gpa, size, &val))
+		return -EOPNOTSUPP;
+
+	trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, size, gpa, &val);
+	return 0;
+}
+
+static inline int tdx_mmio_read(struct kvm_vcpu *vcpu, gpa_t gpa, int size)
+{
+	unsigned long val;
+
+	if (kvm_iodevice_read(vcpu, &vcpu->arch.apic->dev, gpa, size, &val) &&
+	    kvm_io_bus_read(vcpu, KVM_MMIO_BUS, gpa, size, &val))
+		return -EOPNOTSUPP;
+
+	tdvmcall_set_return_val(vcpu, val);
+	trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val);
+	return 0;
+}
+
+static int tdx_emulate_mmio(struct kvm_vcpu *vcpu)
+{
+	struct kvm_memory_slot *slot;
+	int size, write, r;
+	unsigned long val;
+	gpa_t gpa;
+
+	KVM_BUG_ON(vcpu->mmio_needed, vcpu->kvm);
+
[...]
+
+	/* Request the device emulation to userspace device model. */
+	vcpu->mmio_needed = 1;
+	vcpu->mmio_is_write = write;
Then they can be dropped.
We may drop mmio_needed. mmio_is_write is needed as above.



+	vcpu->arch.complete_userspace_io = tdx_complete_mmio;
+
+	vcpu->run->mmio.phys_addr = gpa;
+	vcpu->run->mmio.len = size;
+	vcpu->run->mmio.is_write = write;
+	vcpu->run->exit_reason = KVM_EXIT_MMIO;
+
+	if (write) {
+		memcpy(vcpu->run->mmio.data, &val, size);
+	} else {
+		vcpu->mmio_fragments[0].gpa = gpa;
+		vcpu->mmio_fragments[0].len = size;
These two lines can be dropped as well.
ditto.





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux