On 9/10/2021 9:59 AM, Xiaoyao Li wrote:
On 9/10/2021 5:41 AM, Sean Christopherson wrote:
On Fri, Aug 27, 2021, Xiaoyao Li wrote:
CPUID 0x14 leaves report the capabilities of Intel PT, e.g. they decide
which bits are valid to be set in MSR_IA32_RTIT_CTL, and report the
number of PT ADDR ranges.
KVM needs to check that guest CPUID values set by userspace don't
enable any bit which is not supported by bare metal. Otherwise,
1. it will trigger VM-entry failure if a hardware-unsupported bit is
exposed to the guest and set by the guest.
2. it triggers #GP when context switching PT MSRs if more RTIT_ADDR*
MSRs are exposed than the hardware supports.
Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
..
+ * pt_desc.ctl_bitmask in later update_intel_pt_cfg().
+ *
+ * pt_desc.ctl_bitmask decides the legal value for guest
+ * MSR_IA32_RTIT_CTL. KVM cannot support PT capabilities beyond
+ * native, otherwise it will trigger vm-entry failure if guest sets
+ * native unsupported bits in MSR_IA32_RTIT_CTL.
+ */
+ best = cpuid_entry2_find(entries, nent, 0x14, 0);
+ if (best) {
+ cpuid_count(0x14, 0, &eax, &ebx, &ecx, &edx);
+ if (best->ebx & ~ebx || best->ecx & ~ecx)
+ return -EINVAL;
+ }
+ best = cpuid_entry2_find(entries, nent, 0x14, 1);
+ if (best) {
+ cpuid_count(0x14, 1, &eax, &ebx, &ecx, &edx);
+ if (((best->eax & 0x7) > (eax & 0x7)) ||
Ugh, looking at the rest of the code, even this isn't sufficient because
pt_desc.guest.addr_{a,b} are hardcoded at 4 entries, i.e. running KVM on
hardware with >4 entries will lead to buffer overflows.
It's hardcoded to 4 because there is a note of "no processors support
more than 4 address ranges" in SDM vol.3 Chapter 31.3.1, table 31-11.
One option would be to bump that to the theoretical max of 15, which
doesn't seem too horrible, especially if pt_desc as a whole is allocated
on-demand, which it probably should be since it isn't exactly tiny (nor
ubiquitous).
A different option would be to let userspace define whatever it wants for
guest CPUID, and instead cap nr_addr_ranges at min(host.cpuid, guest.cpuid,
RTIT_ADDR_RANGE).
Letting userspace generate a bad MSR_IA32_RTIT_CTL is not problematic,
there are plenty of ways userspace can deliberately trigger VM-Entry
failure due to invalid guest state (even if this is a VM-Fail condition,
it's not a danger to KVM).
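A minimal user-space sketch of the capping option described above, not the actual KVM code. The helper name pt_effective_addr_ranges() and the mask macro are hypothetical; RTIT_ADDR_RANGE is assumed to be 4, matching the hardcoded array size mentioned earlier in the thread, and the count is assumed to live in EAX[2:0] of CPUID leaf 0x14, subleaf 1, as the "& 0x7" in the patch implies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of the pt_desc.guest.addr_{a,b} arrays (4 in the thread). */
#define RTIT_ADDR_RANGE 4u

/* Hypothetical name: EAX[2:0] of CPUID.(0x14, 1) holds the range count. */
#define PT_NR_ADDR_RANGES_MASK 0x7u

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: cap the number of address ranges at
 * min(host CPUID, guest CPUID, RTIT_ADDR_RANGE) instead of
 * rejecting the guest CPUID outright.
 */
static uint32_t pt_effective_addr_ranges(uint32_t host_eax, uint32_t guest_eax)
{
	uint32_t host = host_eax & PT_NR_ADDR_RANGES_MASK;
	uint32_t guest = guest_eax & PT_NR_ADDR_RANGES_MASK;
	uint32_t nr = min_u32(min_u32(host, guest), RTIT_ADDR_RANGE);

	/* The result must never index past the fixed-size arrays. */
	assert(nr <= RTIT_ADDR_RANGE);
	return nr;
}
```

With this approach a guest CPUID that claims 7 ranges on a host with 2 would simply end up with 2 usable ranges, and the fixed-size arrays are never overrun even on hardware reporting more than 4.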
I'm fine with only safeguarding nr_addr_range if VM-Entry failure
doesn't matter.
Hi Sean.
It seems I misread your comment. Everything above was about the check
on nr_addr_range. Did you mean that a check is not necessary if its
only purpose is to avoid VM-entry failure?
The problem is: 1) the check on nr_addr_range is to avoid an MSR read #GP.
Though the kernel will fix up the #GP, it still prints the warning message.
2) The other check in this patch, on guest CPUID 0x14, is to avoid VM-entry
failure.
So I want to ask: do you think both 1) and 2) are unnecessary, or
only 2)?
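To make the distinction between 1) and 2) concrete, here is a rough stand-alone sketch of the two checks as separate predicates. It is an illustration only: struct pt_caps and both function names are hypothetical, and the register values stand in for what cpuid_count() would return for leaf 0x14 on the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for one CPUID 0x14 subleaf. */
struct pt_caps {
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Check 1): the guest must not see more address ranges than the host
 * has, otherwise accessing the extra RTIT_ADDR* MSRs would #GP.
 * Both arguments are EAX of subleaf 1.
 */
static bool pt_addr_ranges_within_host(uint32_t guest_eax, uint32_t host_eax)
{
	return (guest_eax & 0x7u) <= (host_eax & 0x7u);
}

/*
 * Check 2): the guest may only advertise capability bits (subleaf 0,
 * EBX/ECX) that the host also has, otherwise the guest could set
 * RTIT_CTL bits that cause VM-entry failure.
 */
static bool pt_caps_subset_of_host(const struct pt_caps *guest,
				   const struct pt_caps *host)
{
	assert(guest && host);
	return !(guest->ebx & ~host->ebx) && !(guest->ecx & ~host->ecx);
}
```

Keeping the predicates separate makes it easy to drop 2) alone if VM-entry failure is considered harmless, while still keeping 1).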