On Sun, Jan 31, 2021 at 9:17 PM Like Xu <like.xu@xxxxxxxxxxxxxxx> wrote:
>
> Hi geniuses,
>
> Please help review this new version which enables the guest LBR.
>
> We already upstreamed the guest LBR support in the host perf, please
> check more details in each commit and feel free to test and comment.
>
> QEMU part: https://lore.kernel.org/qemu-devel/20210201045453.240258-1-like.xu@xxxxxxxxxxxxxxx
> kvm-unit-tests: https://lore.kernel.org/kvm/20210201045751.243231-1-like.xu@xxxxxxxxxxxxxxx
>
> v13-v14 Changelog:
> - Rewrite crud about vcpu->arch.perf_capabilities;
> - Add PERF_CAPABILITIES testcases to tools/testing/selftests/kvm;
> - Add basic LBR testcases to the kvm-unit-tests (w/ QEMU patches);
> - Apply rewritten commit log from Paolo;
> - Queued the first patch "KVM: x86: Move common set/get handler ...";
> - Rename 'already_passthrough' to 'msr_passthrough';
> - Check the values of MSR_IA32_PERF_CAPABILITIES early;
> - Call kvm_x86_ops.pmu_ops->cleanup() always and drop extra_cleanup;
> - Use INTEL_PMC_IDX_FIXED_VLBR directly;
> - Fix a bug in the vmx_get_perf_capabilities();
>
> Previous:
> https://lore.kernel.org/kvm/20210108013704.134985-1-like.xu@xxxxxxxxxxxxxxx/
>
> ---
>
> The last branch recording (LBR) is a performance monitor unit (PMU)
> feature on Intel processors that records a running trace of the most
> recent branches taken by the processor in the LBR stack. This patch
> series is going to enable this feature for plenty of KVM guests.
>
> with this patch set, the following error will be gone forever and cloud
> developers can better understand their programs with less profiling overhead:
>
> $ perf record -b lbr ${WORKLOAD}
> or $ perf record --call-graph lbr ${WORKLOAD}
> Error:
> cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
>
> The user space could configure whether it's enabled or not for each
> guest via MSR_IA32_PERF_CAPABILITIES msr. As a first step, a guest
> could only enable LBR feature if its cpu model is the same as the
> host since the LBR feature is still one of model specific features.
>
> If it's enabled on the guest, the guest LBR driver would accesses the
> LBR MSR (including IA32_DEBUGCTLMSR and records MSRs) as host does.
> The first guest access on the LBR related MSRs is always interceptible.
> The KVM trap would create a special LBR event (called guest LBR event)
> which enables the callstack mode and none of hardware counter is assigned.
> The host perf would enable and schedule this event as usual.
>
> Guest's first access to a LBR registers gets trapped to KVM, which
> creates a guest LBR perf event. It's a regular LBR perf event which gets
> the LBR facility assigned from the perf subsystem. Once that succeeds,
> the LBR stack msrs are passed through to the guest for efficient accesses.
> However, if another host LBR event comes in and takes over the LBR
> facility, the LBR msrs will be made interceptible, and guest following
> accesses to the LBR msrs will be trapped and meaningless.
>
> Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) in
> VMX transition brings too excessive overhead to frequent vmx transition
> itself, the guest LBR event would help save/restore the LBR stack msrs
> during the context switching with the help of native LBR event callstack
> mechanism, including LBR_SELECT msr.
>
> If the guest no longer accesses the LBR-related MSRs within a scheduling
> time slice and the LBR enable bit is unset, vPMU would release its guest
> LBR event as a normal event of a unused vPMC and the pass-through
> state of the LBR stack msrs would be canceled.

How does live migration work? I don't see any mechanism for recording
the current LBR MSRs on suspend or restoring them on resume.
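
To make the lifecycle in the quoted cover letter concrete, here is a toy
userspace model in C. It is only a sketch under stated assumptions: the
struct, its fields, and the function names are hypothetical and are not
taken from the patches, and the model tracks interception and ownership
state only, not the contents of any MSR. Where those contents would be
captured for migration is exactly the open question above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; field and function names are illustrative only. */
struct lbr_model {
	bool guest_event;  /* a guest LBR perf event exists */
	bool host_owns;    /* a host LBR event has taken over the facility */
	bool passthrough;  /* LBR stack MSRs are passed through to the guest */
	bool lbr_enabled;  /* the guest's LBR enable bit is set */
	bool accessed;     /* guest touched an LBR MSR in this time slice */
};

/* Guest touches an LBR MSR. Returns true if the access trapped to KVM. */
static bool guest_access(struct lbr_model *m)
{
	bool trapped = !m->passthrough;

	m->accessed = true;
	if (trapped) {
		/* The trap creates the guest LBR event if it does not exist yet. */
		m->guest_event = true;
		/* Pass through only if the event actually got the LBR facility. */
		m->passthrough = !m->host_owns;
	}
	return trapped;
}

/* Another host LBR event takes over: the MSRs become interceptible again. */
static void host_takeover(struct lbr_model *m)
{
	m->host_owns = true;
	m->passthrough = false;
}

/* End of a scheduling time slice: release the event if the guest is idle. */
static void end_of_slice(struct lbr_model *m)
{
	if (!m->accessed && !m->lbr_enabled) {
		m->guest_event = false;
		m->passthrough = false;
	}
	m->accessed = false;
	/* Pass-through never outlives the guest LBR event in this model. */
	assert(!m->passthrough || m->guest_event);
}
```

In this model the first guest_access() on a zeroed struct traps and
switches to pass-through, host_takeover() revokes it, and an idle slice
with the enable bit clear releases the event.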