On Wed, Jun 22, 2022, Tetsuo Handa wrote: > On 2018/03/28 16:29, Wanpeng Li wrote: > >> syzbot dashboard link: > >> https://syzkaller.appspot.com/bug?extid=760a73552f47a8cd0fd9 > >> > > Maybe the same as this one. https://lkml.org/lkml/2018/3/21/174 Paolo, > > any idea against my analysis? > > No progress for 4 years. Did somebody check Wanpeng's analysis ? The most recent failure is a different bug, the splat Wanpeng debugged requires unrestricted guest to be disabled, whereas this does not. Somewhat of a side topic, if the old bug still exists (the syzkaller reproducer fails with invalid guest state, so it's not clear whether or not the bug is still a problem), I suspect this hack-a-fix would handle the Real Mode injection case: diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 735543df829a..58801d3888c8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8209,7 +8209,7 @@ void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip) ctxt->_eip = ctxt->eip + inc_eip; ret = emulate_int_real(ctxt, irq); - if (ret != X86EMUL_CONTINUE) { + if (ret != X86EMUL_CONTINUE || vcpu->mmio_needed) { kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); } else { ctxt->eip = ctxt->_eip; If I ever have time and/or get bored, I'll try to repro the realmode bug unless someone beats me to it. > Since I'm not familiar with KVM, my questions from different direction... > > > > syzbot is hitting WARN_ON(vcpu->arch.pio.count || vcpu->mmio_needed) added by > commit 716d51abff06f484 ("KVM: Provide userspace IO exit completion callback") > due to vcpu->mmio_needed == true. > > Question 1: what is the intent of checking for vcpu->mmio_needed == false? It's a sanity check to detect KVM bugs. If vcpu->mmio_needed is true, KVM needs to exit to userspace to complete the MMIO operation. On that exit to userspace, KVM is supposed to also set a callback to essentially acknowledge that the MMIO completed. The issue in this bug is that after setting vcpu->mmio_needed, KVM detects and injects an exception. Because of how KVM handles MMIO, unlike MMIO reads, MMIO writes don't immediately stop emulation. While odd, it should work because MMIO writes shouldn't be processed until after all fault checks have passed. The underlying bug is that LTR emulation has incorrect ordering and checks for a non-canonical base _after_ marking the TSS as busy (which triggers MMIO). So as much as I want to suppress this type of warn by clearing vcpu->mmio_needed when injecting an exception, I suspect playing whack-a-mole is the right approach because all those moles are likely bugs :-( Though one thing we can do is change the WARN_ON() to a WARN_ON_ONCE() so that kernels outside of panic_on_warn=1 won't blow up on a buggy/malicious userspace. diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 39ea9138224c..09e4b67b881f 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -1699,16 +1699,6 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt, case VCPU_SREG_TR: if (seg_desc.s || (seg_desc.type != 1 && seg_desc.type != 9)) goto exception; - if (!seg_desc.p) { - err_vec = NP_VECTOR; - goto exception; - } - old_desc = seg_desc; - seg_desc.type |= 2; /* busy */ - ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc, - sizeof(seg_desc), &ctxt->exception); - if (ret != X86EMUL_CONTINUE) - return ret; break; case VCPU_SREG_LDTR: if (seg_desc.s || seg_desc.type != 2) @@ -1749,6 +1739,15 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt, ((u64)base3 << 32), ctxt)) return emulate_gp(ctxt, 0); } + + if (seg == VCPU_SREG_TR) { + old_desc = seg_desc; + seg_desc.type |= 2; /* busy */ + ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc, + sizeof(seg_desc), &ctxt->exception); + if (ret != X86EMUL_CONTINUE) + return ret; + } load: ctxt->ops->set_segment(ctxt, selector, &seg_desc, base3, seg); if (desc) > If we run a reproducer provided by syzbot, we can observe that mutex_unlock(&vcpu->mutex) > in kvm_vcpu_ioctl() is called with vcpu->mmio_needed == true. > > Question 2: Is kvm_vcpu_ioctl() supposed to leave with vcpu->mmio_needed == false? > In other words, is doing > > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -4104,6 +4104,7 @@ static long kvm_vcpu_ioctl(struct file *filp, > r = kvm_arch_vcpu_ioctl(filp, ioctl, arg); > } > out: > + WARN_ON_ONCE(vcpu->mmio_needed); > mutex_unlock(&vcpu->mutex); > kfree(fpu); > kfree(kvm_sregs); > > appropriate? It's not appropriate, mmio_needed is actually supposed to be accompanied by a exit from kvm_vcpu_ioctl() to userspace.