On Fri, Oct 18, 2024, Maxim Levitsky wrote:
> Hi,
>
> Our CI found another issue, this time with vmx_pmu_caps_test.
>
> On 'Intel(R) Xeon(R) Gold 6328HL CPU' I see that all LBR msrs (from/to and
> TOS), are always read only - even when LBR is disabled - once I disable the
> feature in DEBUG_CTL, all LBR msrs reset to 0, and you can't change their
> value manually. Freeze LBRS on PMI seems not to affect this behavior.
>
> I don't know if this is how the hardware is supposed to work (Intel's manual
> doesn't mention anything about this), or if it is something platform
> specific, because this system also was found to have LBRs enabled
> (IA32_DEBUGCTL.LBR == 1) after a fresh boot, as if BIOS left them enabled - I
> don't have an idea on why.
>
> The problem is that vmx_pmu_caps_test writes 0 to LBR_TOS via KVM_SET_MSRS,
> and KVM actually passes this write to actual hardware msr (this is somewhat
> weird),

When the "virtual" LBR event is active in host perf, the LBR MSRs are passed
through to the guest, and so KVM needs to propagate the guest values into
hardware.

> and since the MSR is not writable and silently drops writes instead,
> once the test tries to read it, it gets some random value instead.

This just showed up in our testing too (delayed backport on our end). I haven't
(yet) tried debugging our setup, but is there any chance Intel PT is interfering?

  33.3.1.2 Model Specific Capability Restrictions
  Some processor generations impose restrictions that prevent use of
  LBRs/BTS/BTM/LERs when software has enabled tracing with Intel Processor Trace.
  On these processors, when TraceEn is set, updates of LBR, BTS, BTM, LERs are
  suspended but the states of the corresponding IA32_DEBUGCTL control fields
  remained unchanged as if it were still enabled. When TraceEn is cleared, the
  LBR array is reset, and LBR/BTS/BTM/LERs updates will resume.
  Further, reads of these registers will return 0, and writes will be dropped.

  The list of MSRs whose updates/accesses are restricted follows.

    • MSR_LASTBRANCH_x_TO_IP, MSR_LASTBRANCH_x_FROM_IP, MSR_LBR_INFO_x,
      MSR_LASTBRANCH_TOS
    • MSR_LER_FROM_LIP, MSR_LER_TO_LIP
    • MSR_LBR_SELECT

  For processors with CPUID DisplayFamily_DisplayModel signatures of 06_3DH,
  06_47H, 06_4EH, 06_4FH, 06_56H, and 06_5EH, the use of Intel PT and LBRs are
  mutually exclusive.

If Intel PT is NOT responsible, i.e. the behavior really is due to
DEBUG_CTL.LBR=0, then I don't see how KVM can sanely virtualize LBRs.
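
For anyone who wants to rule the PT theory in or out on a given box, here is a
rough sketch of checking a CPUID.01H:EAX value against the signatures listed in
the SDM excerpt above. The helper names are made up for illustration and are not
KVM or perf code. The table holds only the six models named in that paragraph,
so a "false" result says nothing about other processor generations that may
impose similar restrictions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: DisplayFamily as derived from CPUID.01H:EAX. */
static unsigned int display_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	/* The extended family field is only added when family == 0xF. */
	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Hypothetical helper: DisplayModel as derived from CPUID.01H:EAX. */
static unsigned int display_model(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;
	unsigned int model = (eax >> 4) & 0xf;

	/* The extended model field only applies to family 0x6 and 0xF. */
	if (family == 0x6 || family == 0xf)
		model |= ((eax >> 16) & 0xf) << 4;
	return model;
}

/*
 * Hypothetical helper: true if the signature is one of the 06_xxH models the
 * quoted SDM section lists as having mutually exclusive Intel PT and LBRs.
 */
static bool pt_excludes_lbr(uint32_t eax)
{
	static const uint8_t models[] = { 0x3d, 0x47, 0x4e, 0x4f, 0x56, 0x5e };
	size_t i;

	if (display_family(eax) != 0x6)
		return false;

	for (i = 0; i < sizeof(models) / sizeof(models[0]); i++) {
		if (display_model(eax) == models[i])
			return true;
	}
	return false;
}
```

In userspace the EAX value can be fetched with __get_cpuid(1, ...) from GCC's
or clang's <cpuid.h>. Even on a listed model, the restriction only kicks in
while TraceEn is set, so this narrows down where to look but doesn't prove PT
is the culprit.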