Re: Regression in nested SVM on 4.16 (bisected)

On 2018-06-09 07:48, Liran Alon wrote:
> So I think the path to how to further bisect bug is very clear here:
> 1) First, attempt to change rsm_interception() to pass
> EMULTYPE_NO_REEXECUTE and see if it makes a difference. (BTW, you can
> submit a commit that adds this EMULTYPE_NO_REEXECUTE as it should be
> present here)
> 2) If that doesn't work, attempt to remove rsm_ins_bytes and instead
> pass NULL. If this works, this means that there are cases which raise
> RSM interception on bytes different than "\x0f\xaa".

Neither of those helps.

> Anyway, having a look at
> # echo 1 >/sys/kernel/debug/tracing/events/kvm/enable
> # cat /sys/kernel/debug/tracing/trace_pipe
> Should help debug the issue in case you discover this patch wasn't
> the root-cause.

Tracing further, it seems the issue is that L0 is completely unaware of
when L2 enters SMM. It doesn't know the right SMBASE, and it doesn't
know whether L2 is in SMM or not. The emulation delivers a #UD and then
L2 triggers a shutdown:

d..1 27652.855246: kvm_entry: vcpu 4
.... 27652.855248: kvm_exit: reason rsm rip 0xfd399 info 0 0
.... 27652.855248: kvm_nested_vmexit: rip: 0x00000000000fd399 reason:
rsm ext_inf1: 0x0000000000000000 ext_inf2: 0x0000000000000000 ext_int:
0x00000000 ext_int_err: 0x00000000
.... 27652.855254: kvm_emulate_insn: 0:fd399:0f aa (prot32)
.... 27652.855257: kvm_inj_exception: #UD (0x0)
d..1 27652.855258: kvm_entry: vcpu 4
.... 27652.855259: kvm_exit: reason shutdown rip 0xfd399 info 0 0
.... 27652.855259: kvm_nested_vmexit: rip: 0x00000000000fd399 reason:
shutdown ext_inf1: 0x0000000000000000 ext_inf2: 0x0000000000000000
ext_int: 0x80000b08 ext_int_err: 0x00000000
.... 27652.855259: kvm_nested_vmexit_inject: reason: shutdown ext_inf1:
0x0000000000000000 ext_inf2: 0x0000000000000000 ext_int: 0x80000b08
ext_int_err: 0x00000000
d..1 27652.855260: kvm_entry: vcpu 4

L0 has no idea that L2 is in SMM (when L1 boots I do see SMM
entries/exits and correctly emulated rsm intercepts).
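
As a side note on the ext_int value in the shutdown exit above, here is
a minimal sketch for decoding it. It assumes the field follows the
EXITINTINFO layout from the AMD manual (vector in bits 7:0, type in bits
10:8, error-code-valid in bit 11, valid in bit 31); the helper name is
made up for illustration.

```python
def decode_exit_int_info(value):
    """Split a 32-bit EXITINTINFO-style value into its fields.

    Assumed layout: bits 7:0 vector, bits 10:8 type,
    bit 11 error code valid, bit 31 valid.
    """
    return {
        "vector": value & 0xFF,
        "type": (value >> 8) & 0x7,
        "error_code_valid": bool(value & (1 << 11)),
        "valid": bool(value & (1 << 31)),
    }


# Value taken from the shutdown exit in the trace above.
info = decode_exit_int_info(0x80000B08)
print(info)
```

Under that assumption the value decodes to a valid exception (type 3)
with vector 8, i.e. a #DF was being delivered when the shutdown
happened, which would fit the #UD escalating inside L2.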

But without the bad commit I get this:

.... 12724.894359: kvm_exit: reason UD excp rip 0xfd399 info 0 0
.... 12724.894359: kvm_nested_vmexit: rip: 0x00000000000fd399 reason: UD
excp ext_inf1: 0x0000000000000000 ext_inf2: 0x0000000000000000 ext_int:
0x00000000 ext_int_err: 0x00000000
.... 12724.894359: kvm_nested_vmexit_inject: reason: UD excp ext_inf1:
0x0000000000000000 ext_inf2: 0x0000000000000000 ext_int: 0x00000000
ext_int_err: 0x00000000
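
To compare the two traces side by side, something like the following
can pull the exit reasons out of trace_pipe output. It is only a sketch:
the regex is written against the kvm_exit lines quoted in this mail, and
the function name is hypothetical.

```python
import re

# Matches lines such as:
#   .... 27652.855248: kvm_exit: reason rsm rip 0xfd399 info 0 0
# The reason may contain spaces ("UD excp"), hence the lazy match up to " rip ".
KVM_EXIT = re.compile(r"kvm_exit: reason (?P<reason>.+?) rip (?P<rip>0x[0-9a-f]+)")


def exit_reasons(trace_text):
    """Return (reason, rip) for every kvm_exit event in the given trace text."""
    return [
        (m.group("reason"), int(m.group("rip"), 16))
        for m in KVM_EXIT.finditer(trace_text)
    ]


# kvm_exit lines copied from the traces above.
bad = """\
.... 27652.855248: kvm_exit: reason rsm rip 0xfd399 info 0 0
.... 27652.855259: kvm_exit: reason shutdown rip 0xfd399 info 0 0
"""
good = ".... 12724.894359: kvm_exit: reason UD excp rip 0xfd399 info 0 0\n"

print(exit_reasons(bad))
print(exit_reasons(good))
```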

So it's still a #UD either way. But it seems the problem here is that
when the RSM handler raises a #UD because it thinks the guest isn't in
SMM (which would be fine if it were delivered to L1, since L1 knows how
to handle it), the #UD gets delivered straight to L2. Without the RSM
intercept, the #UD triggers a nested vmexit, and things work.
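
Put as a toy model (not kernel code, just the two sequences from the
traces written down; the function name and event strings are made up,
and it assumes L1 intercepts #UD, as the second trace suggests):

```python
def rsm_in_l2(l0_intercepts_rsm):
    """Toy model of what happens when L2 executes RSM.

    Assumption: L1 intercepts #UD, so a #UD exit from L2 is reflected
    to L1 as a nested vmexit.
    """
    if l0_intercepts_rsm:
        # With the RSM intercept: L0 emulates the instruction, decides the
        # vCPU is not in SMM, and the resulting #UD goes straight into L2.
        return ["exit: rsm", "L0 emulates rsm", "#UD injected into L2"]
    # Without the RSM intercept: the #UD arrives as an exception exit
    # and is turned into a nested vmexit for L1.
    return ["exit: UD excp", "nested vmexit to L1"]


print(rsm_in_l2(True))
print(rsm_in_l2(False))
```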

I tried following the exception injection path but I'm a bit lost.
inject_pending_event calls into kvm_x86_ops->check_nested_events, which
I think is supposed to turn some events into nested vmexits, but is only
implemented for VMX, not SVM. Then kvm_x86_ops->queue_exception gets
called with vcpu->arch.exception.injected = true. svm_queue_exception
has a path to nested_svm_check_exception, but only when injected ==
false. Even if I get rid of that check, nested_svm_check_exception calls
nested_svm_intercept which returns NESTED_EXIT_HOST, and that goes
nowhere again.
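
For reference, this is how I understand that path, written as a toy
model rather than actual kernel code. The names mirror the functions
mentioned above, but the constants' values, the function signature and
the skip_injected_check knob are placeholders of mine, and the vCPU is
assumed to be running L2.

```python
# Placeholder values for this sketch only.
NESTED_EXIT_HOST = 0  # exit is handled by L0
NESTED_EXIT_DONE = 1  # exit was turned into a nested vmexit


def queue_exception(injected, nested_intercept, skip_injected_check=False):
    """Toy model: where does an exception queued for L2 end up?

    injected            -- models vcpu->arch.exception.injected
    nested_intercept    -- what the nested intercept check returns
    skip_injected_check -- models removing the 'injected == false' condition
    """
    if skip_injected_check or not injected:
        # Only on this path is the nested intercept check consulted.
        if nested_intercept == NESTED_EXIT_DONE:
            return "nested vmexit to L1"
    # Otherwise the exception is simply injected into the running guest.
    return "injected into L2"


# What I see: injected is true, so the nested check is never reached.
print(queue_exception(injected=True, nested_intercept=NESTED_EXIT_HOST))
# Dropping the check does not help while the result is NESTED_EXIT_HOST.
print(queue_exception(True, NESTED_EXIT_HOST, skip_injected_check=True))
```

In this model the #UD only reaches L1 if the injected check is bypassed
and the intercept check reports that a nested vmexit was done.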

-- 
Hector Martin "marcan" (marcan@xxxxxxxxx)
Public Key: https://mrcn.st/pub


