Re: [PATCH 2/4] KVM: nVMX: Check GUEST_DR7 on vmentry of nested guests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 30, 2019 at 4:15 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
>
> On Fri, Aug 30, 2019 at 4:07 PM Krish Sadhukhan
> <krish.sadhukhan@xxxxxxxxxx> wrote:
> >
> >
> >
> > On 08/29/2019 03:26 PM, Jim Mattson wrote:
> > > On Thu, Aug 29, 2019 at 2:25 PM Krish Sadhukhan
> > > <krish.sadhukhan@xxxxxxxxxx> wrote:
> > >> According to section "Checks on Guest Control Registers, Debug Registers, and
> > >> and MSRs" in Intel SDM vol 3C, the following checks are performed on vmentry
> > >> of nested guests:
> > >>
> > >>      If the "load debug controls" VM-entry control is 1, bits 63:32 in the DR7
> > >>      field must be 0.
> > > Can't we just let the hardware check guest DR7? This results in
> > > "VM-entry failure due to invalid guest state," right? And we just
> > > reflect that to L1?
> >
> > Just trying to understand the reason why this particular check can be
> > deferred to the hardware.
>
> The vmcs02 field has the same value as the vmcs12 field, and the
> physical CPU has the same requirements as the virtual CPU.

Actually, you're right. There is a problem. With the current
implementation, there's a priority inversion if the vmcs12 contains
both illegal guest state for which the checks are deferred to
hardware, and illegal entries in the VM-entry MSR-load area. In this
case, we will synthesize a "VM-entry failure due to MSR loading"
rather than a "VM-entry failure due to invalid guest state."

There are so many checks on guest state that it's really compelling to
defer as many as possible to hardware. However, we need to fix the
aforesaid priority inversion. Instead of returning early from
nested_vmx_enter_non_root_mode() with EXIT_REASON_MSR_LOAD_FAIL, we
could induce a "VM-entry failure due to MSR loading" for the next
VM-entry of vmcs02 and continue with the attempted vmcs02 VM-entry. If
hardware exits with EXIT_REASON_INVALID_STATE, we reflect that to L1,
and if hardware exits with EXIT_REASON_INVALID_STATE, we reflect that
to L1 (along with the appropriate exit qualification).

The tricky part is in undoing the successful MSR writes if we reflect
EXIT_REASON_INVALID_STATE to L1. Some MSR writes can't actually be
undone (e.g. writes to IA32_PRED_CMD), but maybe we can get away with
those. (Fortunately, it's illegal to put x2APIC MSRs in the VM-entry
MSR-load area!) Other MSR writes are just a bit tricky to undo (e.g.
writes to IA32_TIME_STAMP_COUNTER).

Alternatively, we could perform validity checks on the entire vmcs12
VM-entry MSR-load area before writing any of the MSRs. This may be
easier, but it would certainly be slower. We would have to be wary of
situations where processing an earlier entry affects the validity of a
later entry. (If we take this route, then we would also have to
process the valid prefix of the VM-entry MSR-load area when we reflect
EXIT_REASON_MSR_LOAD_FAIL to L1.)

Note that this approach could be extended to permit the deferral of
some control field checks to hardware as well. As long as the control
field is copied verbatim from vmcs12 to vmcs02 and the virtual CPU
enforces the same constraints as the physical CPU, deferral should be
fine. We just have to make sure that we induce a "VM-entry failure due
to invalid guest state" for the next VM-entry of vmcs02 if any
software checks on guest state fail, rather than immediately
synthesizing an "VM-entry failure due to invalid guest state" during
the construction of vmcs02.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux