Re: vmx_pmu_caps_test fails on Skylake-based CPUs due to read-only LBRs

On Mon, 2024-10-28 at 08:55 -0700, Sean Christopherson wrote:
> On Fri, Oct 18, 2024, Maxim Levitsky wrote:
> > Hi,
> > 
> > Our CI found another issue, this time with vmx_pmu_caps_test.
> > 
> > On 'Intel(R) Xeon(R) Gold 6328HL CPU' I see that all LBR MSRs (from/to and
> > TOS) are always read-only - even when LBR is disabled - once I disable the
> > feature in DEBUG_CTL, all LBR MSRs reset to 0, and you can't change their
> > value manually.  Freeze LBRs on PMI seems not to affect this behavior.
> > 
> > I don't know if this is how the hardware is supposed to work (Intel's manual
> > doesn't mention anything about this), or if it is something platform
> > specific, because this system also was found to have LBRs enabled
> > (IA32_DEBUGCTL.LBR == 1) after a fresh boot, as if BIOS left them enabled - I
> > have no idea why.
> > 
> > The problem is that vmx_pmu_caps_test writes 0 to LBR_TOS via KVM_SET_MSRS,
> > and KVM actually passes this write to the actual hardware MSR (this is
> > somewhat weird),
> 
> When the "virtual" LBR event is active in host perf, the LBR MSRs are passed
> through to the guest, and so KVM needs to propagate the guest values into hardware.

Yes, but usually KVM_SET_MSRS doesn't touch hardware directly, even for registers/MSRs
that are passed through. Instead, the relevant values are loaded into hardware when the
guest vCPU is loaded and/or when the guest is entered.
I don't know the details though.


> 
> > and since the MSR is not writable and silently drops writes instead,
> > once the test tries to read it, it gets some random value instead.
> 
> This just showed up in our testing too (delayed backport on our end).  I haven't
> (yet) tried debugging our setup, but is there any chance Intel PT is interfering?
> 
>   33.3.1.2 Model Specific Capability Restrictions
>   Some processor generations impose restrictions that prevent use of
>   LBRs/BTS/BTM/LERs when software has enabled tracing with Intel Processor Trace.
>   On these processors, when TraceEn is set, updates of LBR, BTS, BTM, LERs are
>   suspended but the states of the corresponding IA32_DEBUGCTL control fields
>   remained unchanged as if it were still enabled. When TraceEn is cleared, the
>   LBR array is reset, and LBR/BTS/BTM/LERs updates will resume.
>   Further, reads of these registers will return 0, and writes will be dropped.
> 
>   The list of MSRs whose updates/accesses are restricted follows.
>   
>     • MSR_LASTBRANCH_x_TO_IP, MSR_LASTBRANCH_x_FROM_IP, MSR_LBR_INFO_x, MSR_LASTBRANCH_TOS
>     • MSR_LER_FROM_LIP, MSR_LER_TO_LIP
>     • MSR_LBR_SELECT
>   
>   For processors with CPUID DisplayFamily_DisplayModel signatures of 06_3DH,
>   06_47H, 06_4EH, 06_4FH, 06_56H, and 06_5EH, the use of Intel PT and LBRs are
>   mutually exclusive.
> 
> If Intel PT is NOT responsible, i.e. the behavior really is due to DEBUG_CTL.LBR=0,
> then I don't see how KVM can sanely virtualize LBRs.
> 

Hi!


I will check PT influence soon, but to me it looks like the hardware implementation has changed. 
It is just too consistent:

When DEBUG_CTL.LBR=1, the LBRs do work: I see all the registers update. TOS does seem
to be stuck at one value most of the time, but it does change sometimes, and it's non-zero.

The FROM/TO MSRs do show a healthy amount of updates.

Note that I read all MSRs using the 'rdmsr' userspace tool.

However, as soon as I disable DEBUG_CTL.LBR, all these MSRs reset to 0 and can't be changed.
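
The experiment described above could be scripted roughly as follows. This is only a
sketch, assuming the Linux msr driver (/dev/cpu/N/msr, where the file offset selects
the MSR index) and the architectural indices IA32_DEBUGCTL = 0x1D9 and
MSR_LASTBRANCH_TOS = 0x1C9. The helper names (read_msr, write_msr, tos_write_sticks)
are made up for illustration and are not part of any existing tool.

```python
import os
import struct

IA32_DEBUGCTL = 0x1D9        # bit 0 (LBR) enables last-branch recording
MSR_LASTBRANCH_TOS = 0x1C9   # LBR top-of-stack index
DEBUGCTL_LBR = 1 << 0


def read_msr(fd, index):
    """Read one 64-bit MSR; the msr driver uses the file offset as the index."""
    return struct.unpack("<Q", os.pread(fd, 8, index))[0]


def write_msr(fd, index, value):
    """Write one 64-bit MSR at the given index."""
    os.pwrite(fd, struct.pack("<Q", value), index)


def tos_write_sticks(fd, probe=1):
    """Clear DEBUGCTL.LBR, write `probe` to LBR_TOS, report if it reads back.

    The original DEBUGCTL value is restored afterwards (hypothetical helper).
    """
    debugctl = read_msr(fd, IA32_DEBUGCTL)
    write_msr(fd, IA32_DEBUGCTL, debugctl & ~DEBUGCTL_LBR)
    try:
        write_msr(fd, MSR_LASTBRANCH_TOS, probe)
        return read_msr(fd, MSR_LASTBRANCH_TOS) == probe
    finally:
        write_msr(fd, IA32_DEBUGCTL, debugctl)
```

To try it on real hardware, run as root with the msr module loaded, open the device
with os.open("/dev/cpu/0/msr", os.O_RDWR) and pass the descriptor to
tos_write_sticks(). A False result would match the behavior reported here (the write
is silently dropped). A rejected access would surface as an OSError instead.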

I'll check this on another Skylake-based machine and see if it behaves the same way.
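
For the Intel PT question raised earlier in the thread, a rough way to rule it in or
out is sketched below. It assumes /proc/cpuinfo reports the DisplayFamily/DisplayModel
values and that IA32_RTIT_CTL is MSR 0x570 with TraceEn in bit 0. The model list is
copied from the SDM excerpt quoted above, and the function names are hypothetical.

```python
import re

# DisplayFamily_DisplayModel signatures from the quoted SDM text for which
# Intel PT and LBRs are mutually exclusive.
PT_LBR_EXCLUSIVE = {(0x06, m) for m in (0x3D, 0x47, 0x4E, 0x4F, 0x56, 0x5E)}

IA32_RTIT_CTL = 0x570      # Intel PT control MSR (read it with rdmsr)
RTIT_CTL_TRACEEN = 1 << 0  # TraceEn bit


def family_model(cpuinfo_text):
    """Extract (family, model) from the text of /proc/cpuinfo."""
    fam = re.search(r"^cpu family\s*:\s*(\d+)", cpuinfo_text, re.M)
    mod = re.search(r"^model\s*:\s*(\d+)", cpuinfo_text, re.M)
    if fam is None or mod is None:
        raise ValueError("cpu family/model not found")
    return int(fam.group(1)), int(mod.group(1))


def pt_lbr_exclusive(family, model):
    """True if the quoted SDM list says PT and LBRs cannot be used together."""
    return (family, model) in PT_LBR_EXCLUSIVE


def trace_enabled(rtit_ctl):
    """True if TraceEn is set in a raw IA32_RTIT_CTL value."""
    return bool(rtit_ctl & RTIT_CTL_TRACEEN)
```

If TraceEn reads as 0 while the LBR MSRs still drop writes, PT is presumably not the
cause on that machine.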

Best regards,
	Maxim Levitsky




