On 7/15/20 4:12 PM, Jim Mattson wrote:
On Wed, Jul 15, 2020 at 3:40 PM Krish Sadhukhan
<krish.sadhukhan@xxxxxxxxxx> wrote:
On 7/15/20 3:27 PM, Nadav Amit wrote:
On Jul 15, 2020, at 3:21 PM, Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx> wrote:
On 7/13/20 4:30 PM, Nadav Amit wrote:
On Jul 13, 2020, at 4:17 PM, Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx> wrote:
[snip]
I am just saying that the APM language "should be cleared to 0" is misleading if the processor doesn't enforce it.
Just to ensure I am clear - I am not blaming you in any way. I also found
the phrasing confusing.
Having said that, if you (or anyone else) reintroduces “positive” tests, in
which the VM CR3 is modified to ensure VM-entry succeeds when the reserved
non-MBZ bits are set, please ensure the tests fail gracefully. The
non-long-mode CR3 tests crashed since the VM page-tables were incompatible
with the paging mode.
In other words, instead of setting a VMMCALL instruction in the VM to trap
immediately after entry, consider clearing the present-bits in the high
levels of the NPT; or injecting some exception that would trigger exit
during vectoring or something like that.
P.S.: If it wasn’t clear, I am not going to fix KVM itself for some obvious
reasons.
I think since the APM is not clear, re-adding any test that tests those bits is like adding a test with "undefined behavior" to me.
Paolo, should I send a KVM patch to remove the checks for those non-MBZ reserved bits?
Which non-MBZ reserved bits (other than those that I addressed) do you refer
to?
I am referring to,
"[PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are
not set on vmrun of nested guests"
in which I added the following:
+#define MSR_CR3_LEGACY_RESERVED_MASK 0xfe7U
+#define MSR_CR3_LEGACY_PAE_RESERVED_MASK 0x7U
+#define MSR_CR3_LONG_RESERVED_MASK 0xfff0000000000fe7U
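For illustration, here is a minimal sketch of how masks like these could be selected by paging mode and applied to a CR3 value. Only the three mask values come from the patch lines above; the enum and both helper functions are hypothetical names made up for this example and are not the code in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values copied from the patch lines quoted above. */
#define MSR_CR3_LEGACY_RESERVED_MASK      0xfe7U
#define MSR_CR3_LEGACY_PAE_RESERVED_MASK  0x7U
#define MSR_CR3_LONG_RESERVED_MASK        0xfff0000000000fe7U

/* Hypothetical: the three paging modes the masks correspond to. */
enum cr3_paging_mode {
	CR3_MODE_LEGACY,
	CR3_MODE_LEGACY_PAE,
	CR3_MODE_LONG,
};

/* Hypothetical helper: pick the reserved-bit mask for a paging mode. */
static uint64_t cr3_reserved_mask(enum cr3_paging_mode mode)
{
	switch (mode) {
	case CR3_MODE_LEGACY:
		return MSR_CR3_LEGACY_RESERVED_MASK;
	case CR3_MODE_LEGACY_PAE:
		return MSR_CR3_LEGACY_PAE_RESERVED_MASK;
	case CR3_MODE_LONG:
		return MSR_CR3_LONG_RESERVED_MASK;
	}
	return 0;
}

/* Hypothetical helper: true if any bit covered by the mask is set. */
static bool cr3_has_reserved_bits(uint64_t cr3, enum cr3_paging_mode mode)
{
	return (cr3 & cr3_reserved_mask(mode)) != 0;
}
```

Note that none of the masks cover bits 3 and 4 (PWT and PCD), so a CR3 value with only those low bits set is not flagged in any mode.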
In my experience, the APM generally distinguishes between "reserved"
and "reserved, MBZ." The low bits you have indicated for CR3 are
marked only as "reserved" in Figures 3-4, 3-5, and 3-6 of the APM,
volume 2. Only bits 63:52 are marked as "reserved, MBZ." (In fact,
Figure 3-6 of the May 2020 version of the APM, revision 3.35, also
calls out bits 11:0 as the PCID when CR4.PCIDE is set.)
Of course, you could always test the behavior. :-)
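To make the distinction above concrete, the long-mode mask from the patch can be split into the part the APM marks "reserved, MBZ" (bits 63:52) and the part it marks only "reserved" (the low bits). This is a sketch under that reading of Figure 3-6; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Long-mode mask value from the patch quoted earlier in the thread. */
#define CR3_LONG_RESERVED_MASK  0xfff0000000000fe7ULL

/* Hypothetical name: bits 63:52, marked "reserved, MBZ" in the APM. */
#define CR3_LONG_MBZ_MASK       0xfff0000000000000ULL

/*
 * Hypothetical name: the remaining low bits, marked only "reserved".
 * When CR4.PCIDE is set, bits 11:0 hold the PCID instead, so this
 * part of the mask would not apply in that configuration.
 */
#define CR3_LONG_NON_MBZ_MASK   (CR3_LONG_RESERVED_MASK & ~CR3_LONG_MBZ_MASK)

/* Returns the "reserved, MBZ" bits that are set in cr3. */
static uint64_t cr3_long_mbz_bits(uint64_t cr3)
{
	return cr3 & CR3_LONG_MBZ_MASK;
}

/* Returns the plain "reserved" (non-MBZ) bits that are set in cr3. */
static uint64_t cr3_long_non_mbz_bits(uint64_t cr3)
{
	return cr3 & CR3_LONG_NON_MBZ_MASK;
}
```

Keeping the two masks separate would let a test treat the two classes differently, which is the point under discussion here.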
I did some experiments on the processor behavior on an Epyc 2 system via
KVM:
1. MBZ bits: VMRUN passes even if these bits are set to 1, and the guest
exits with an exit code of SVM_EXIT_VMMCALL. According to
the APM, this setting should constitute an invalid guest state and
hence I should get an exit code of SVM_EXIT_ERR. There's no KVM check
in place for these CR3 bits, so the check is all done in hardware.
2. non-MBZ reserved bits: Based on Nadav Amit's suggestion, I cleared
the present bit in an upper-level NPT entry in order to trigger an NPF,
and I did get an exit code of SVM_EXIT_NPF when I set any of these bits.
I am hoping that the processor has done the consistency check before it
tripped on NPF and not the other way around, so that our test is useful :
In PAE-legacy and non-PAE-legacy modes, the guest doesn't exit with
SVM_EXIT_VMMCALL when these bits are set to 0. I am not sure if I am
missing any special setting for the PAE-legacy and non-PAE-legacy modes.
In long-mode, however, the processor seems to behave as per APM, i.e.,
guest exits with SVM_EXIT_VMMCALL when these bits are set to 0.
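The way the exit codes are being read in the experiments above can be sketched as a small classifier. The numeric values are an assumption on my part (I believe they match the definitions in the Linux SVM headers, with SVM_EXIT_ERR being -1 truncated to 32 bits); the enum and the function are hypothetical names for this example only.

```c
#include <stdint.h>

/* Assumed values; check them against the SVM header actually in use. */
#define SVM_EXIT_VMMCALL  0x081u
#define SVM_EXIT_NPF      0x400u
#define SVM_EXIT_ERR      0xffffffffu  /* -1 as a 32-bit value */

/* Hypothetical: what each exit code suggests about the VMRUN. */
enum vmrun_outcome {
	VMRUN_REJECTED,      /* consistency checks failed (invalid guest state) */
	VMRUN_GUEST_RAN,     /* guest reached its VMMCALL */
	VMRUN_ENTERED_NPF,   /* entry went far enough to take a nested page fault */
	VMRUN_UNEXPECTED,    /* anything else */
};

/* Hypothetical helper: map an exit code to one of the outcomes above. */
static enum vmrun_outcome classify_vmrun_exit(uint32_t exit_code)
{
	switch (exit_code) {
	case SVM_EXIT_ERR:
		return VMRUN_REJECTED;
	case SVM_EXIT_VMMCALL:
		return VMRUN_GUEST_RAN;
	case SVM_EXIT_NPF:
		return VMRUN_ENTERED_NPF;
	default:
		return VMRUN_UNEXPECTED;
	}
}
```

As noted in item 2, an NPF exit by itself does not prove that the consistency check ran before the nested page walk; the classifier only records which exit was observed.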