Re: [PATCH] KVM: x86: Reset RSP before exiting to userspace when emulating POPA

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 24, 2024 at 08:33:48AM -0700, Sean Christopherson wrote:
> On Wed, Jul 24, 2024, Tao Su wrote:
> > When emulating POPA and exiting to userspace for MMIO, reset modified RSP
> > as emulation context may not be reset. POPA may generate more multiple
> > reads, i.e. multiple POPs from the stack, but if stack points to MMIO,
> > KVM needs to emulate multiple MMIO reads.
> > 
> > When one MMIO done, POPA may be re-emulated with EMULTYPE_NO_DECODE set,
> > i.e. ctxt will not be reset, but RSP is modified by previous emulation of
> > current POPA instruction, which eventually leads to emulation error.
> > 
> > The commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if
> > emulator detects exception") provides a detailed analysis of how KVM
> > emulates multiple MMIO reads, and its correctness can be verified in the
> > POPA instruction with this patch.
> 
> I don't see how this can work.  If POPA is reading from MMIO, it will need to
> do 8 distinct emulated MMIO accesses.  Unwinding to the original RSP will allow
> the first MMIO (store to EDI) to succeed, but then the second MMIO (store to ESI)
> will exit back to userspace.  And the second restart will load EDI with the
> result of the MMIO, not ESI.  It will also re-trigger the second MMIO indefinitely.
> 

This can work like commit 0dc902267cb3 description. Since there is a buffer
(ctxt->mem_read), KVM can safely restart the instruction from the beginning.

After the first MMIO (store to EDI) done, vcpu->mmio_read_completed is set and
POPA is re-emulated with EMULTYPE_NO_DECODE. When re-emulating, KVM will try
first MMIO (store to EDI) _again_, but KVM will not exit to userspace in
emulator_read_write() because vcpu->mmio_read_completed is set and then adjust
mc->end in read_emulated().

For the second MMIO (store to ESI), it will exit to userspace and re-emulate.
When re-emulating, KVM can directly read EDI from the buffer (mc->data) and try
second MMIO (store to ESI) _again_, but KVM will not exit to userspace in
emulator_read_write() because vcpu->mmio_read_completed is set and then adjust
mc->end in read_emulated().

For the third MMIO (store to EBP) ...

> To make this work, KVM would need to allow precisely resuming execution where
> it left off.  We can't use MMIO fragments, because unlike MMIO accesses that
> split pages, each memory load is an individual access.
> 

Since all MMIO reads are saved to the buffer (mc->data) and the index (mc->end)
is adjusted by every re-emulation, KVM already supports multiple MMIO reads.

> I don't see any reason to try to make this work.  It's a ridiculously convoluted
> scenario that, AFAIK, has no real world application.
> 

I agree POPA+MMIO is rare but supporting it to be same as hardware is cheap. We
can also use POPA to validate the multiple MMIO reads which has a complex flow (
actually, that's why I tried POPA in kvm-unit-test and then find this issue).

> > Signed-off-by: Tao Su <tao1.su@xxxxxxxxxxxxxxx>
> > ---
> > For testing, we can add POPA to the emulator case in kvm-unit-test.
> > ---
> >  arch/x86/kvm/emulate.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > index e72aed25d721..3746fef6ca60 100644
> > --- a/arch/x86/kvm/emulate.c
> > +++ b/arch/x86/kvm/emulate.c
> > @@ -1999,6 +1999,7 @@ static int em_pushf(struct x86_emulate_ctxt *ctxt)
> >  
> >  static int em_popa(struct x86_emulate_ctxt *ctxt)
> >  {
> > +	unsigned long old_esp = reg_read(ctxt, VCPU_REGS_RSP);
> >  	int rc = X86EMUL_CONTINUE;
> >  	int reg = VCPU_REGS_RDI;
> >  	u32 val = 0;
> > @@ -2010,8 +2011,11 @@ static int em_popa(struct x86_emulate_ctxt *ctxt)
> >  		}
> >  
> >  		rc = emulate_pop(ctxt, &val, ctxt->op_bytes);
> > -		if (rc != X86EMUL_CONTINUE)
> > +		if (rc != X86EMUL_CONTINUE) {
> > +			assign_register(reg_rmw(ctxt, VCPU_REGS_RSP),
> > +					old_esp, ctxt->op_bytes);
> >  			break;
> > +		}
> >  		assign_register(reg_rmw(ctxt, reg), val, ctxt->op_bytes);
> >  		--reg;
> >  	}
> > 
> > base-commit: 786c8248dbd33a5a7a07f7c6e55a7bfc68d2ca48
> > -- 
> > 2.34.1
> > 




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux