A polite ping for comments on this version, thanks!

On Tuesday, August 6, 2019 3:16 PM, Wei Wang wrote:
> Last Branch Recording (LBR) is a performance monitor unit (PMU) feature on
> Intel CPUs that captures branch related info. This patch series enables this
> feature for KVM guests.
>
> The LBR feature can be exposed to each guest by userspace setting the
> enabling param in KVM_CAP_X86_GUEST_LBR (patch 3).
>
> About the lbr emulation method:
> When the vcpu gets scheduled in, the lbr related msrs are made interceptible.
> This makes the guest's first access to an lbr related msr always vm-exit to
> kvm, so that kvm can know whether the lbr feature is used during the vcpu
> time slice. The kvm lbr msr handler does the following things:
> - create an lbr perf event (task pinned) for the vcpu thread.
>   The perf event mainly serves 2 purposes:
>   -- follow the host perf scheduling rules to manage the vcpu's usage
>      of lbr (e.g. a cpu pinned lbr event could reclaim lbr and thus
>      stop the vcpu's use);
>   -- have the host perf do context switching of the lbr state on the
>      vcpu thread switching.
> - pass the lbr related msrs through to the guest.
>   This enables the guest's subsequent accesses to the lbr related msrs
>   without vm-exit, as long as the vcpu's lbr event owns the lbr feature.
>   A cpu pinned lbr event on the host could come and take over the lbr
>   feature via IPI calls. In this case, the pass-through will be
>   cancelled (patch 13), and the guest's subsequent accesses to the lbr
>   msrs will vm-exit to kvm and be forbidden in the handler.
>
> If the guest doesn't touch any of the lbr related msrs (likely the guest
> doesn't need to run lbr in the near future), the vcpu's lbr perf event will
> be freed (please see patch 12 commit for more details).
>
> * Tests
> Conclusion: the profiling results on the guest are similar to those on the host.
>
> Run: ./perf -b ./test_program
>
> - Test on the host:
> Overhead  Command  Source Shared Object  Source Symbol   Target Symbol
>   22.35%  ftest    libc-2.23.so          [.] __random    [.] __random
>    8.20%  ftest    ftest                 [.] qux         [.] qux
>    5.88%  ftest    ftest                 [.] random@plt  [.] __random
>    5.88%  ftest    libc-2.23.so          [.] __random    [.] __random_r
>    5.79%  ftest    ftest                 [.] main        [.] random@plt
>    5.60%  ftest    ftest                 [.] main        [.] foo
>    5.24%  ftest    libc-2.23.so          [.] __random    [.] main
>    5.20%  ftest    libc-2.23.so          [.] __random_r  [.] __random
>    5.00%  ftest    ftest                 [.] foo         [.] qux
>    4.91%  ftest    ftest                 [.] main        [.] bar
>    4.83%  ftest    ftest                 [.] bar         [.] qux
>    4.57%  ftest    ftest                 [.] main        [.] main
>    4.38%  ftest    ftest                 [.] foo         [.] main
>    4.13%  ftest    ftest                 [.] qux         [.] foo
>    3.89%  ftest    ftest                 [.] qux         [.] bar
>    3.86%  ftest    ftest                 [.] bar         [.] main
>
> - Test on the guest:
> Overhead  Command  Source Shared Object  Source Symbol   Target Symbol
>   22.36%  ftest    libc-2.23.so          [.] random      [.] random
>    8.55%  ftest    ftest                 [.] qux         [.] qux
>    5.79%  ftest    libc-2.23.so          [.] random      [.] random_r
>    5.64%  ftest    ftest                 [.] random@plt  [.] random
>    5.58%  ftest    ftest                 [.] main        [.] random@plt
>    5.55%  ftest    ftest                 [.] main        [.] foo
>    5.41%  ftest    libc-2.23.so          [.] random      [.] main
>    5.31%  ftest    libc-2.23.so          [.] random_r    [.] random
>    5.11%  ftest    ftest                 [.] foo         [.] qux
>    4.93%  ftest    ftest                 [.] main        [.] main
>    4.59%  ftest    ftest                 [.] qux         [.] bar
>    4.49%  ftest    ftest                 [.] bar         [.] main
>    4.42%  ftest    ftest                 [.] bar         [.] qux
>    4.16%  ftest    ftest                 [.] main        [.] bar
>    3.95%  ftest    ftest                 [.] qux         [.] foo
>    3.79%  ftest    ftest                 [.] foo         [.] main
> (due to the lib version difference, "random" is equivalent to "__random" above)
>
> v7->v8 Changelog:
> - Patch 3:
>   -- document KVM_CAP_X86_GUEST_LBR in api.txt
>   -- make the check of KVM_CAP_X86_GUEST_LBR return the size of
>      struct x86_perf_lbr_stack, to let userspace do a compatibility
>      check.
> - Patch 7:
>   -- support the perf scheduler not assigning a counter to a perf event
>      that has PERF_EV_CAP_NO_COUNTER set (rather than skipping the perf
>      scheduler). This allows the scheduler to detect lbr usage conflicts
>      via get_event_constraints, and lower priority events will finally
>      fail to use lbr.
>   -- define X86_PMC_IDX_NA as "-1", which represents a never assigned
>      counter id. There are other places that use "-1", but they could be
>      updated to use the new macro in another patch series.
> - Patch 8:
>   -- move the event->owner assignment into perf_event_alloc to have it
>      set before event_init is called. Please see this patch's commit for
>      reasons.
> - Patch 9:
>   -- use "exclude_host" and "is_kernel_event" to decide if the lbr event
>      is used for the vcpu lbr emulation, which doesn't need a counter,
>      and remove the usage of the previous new perf_event_create API.
>   -- remove the unused attr fields.
> - Patch 10:
>   -- set a hardware reserved bit (bit 62 of LBR_SELECT) in reg->config
>      for the vcpu lbr emulation event. This makes the config different
>      from other host lbr events, so that they don't share the lbr.
>      Please see the comments in the patch for the reasons why they
>      shouldn't share.
> - Patch 12:
>   -- disable interrupts and check if the vcpu lbr event owns the lbr
>      feature before kvm writes to the lbr related msrs. This avoids kvm
>      updating the lbr msrs after lbr has been reclaimed by other events
>      via ipi.
>   -- remove arch v4 related support.
> - Patch 13:
>   -- double check if the vcpu lbr event owns the lbr feature before
>      vm-entry into the guest. The lbr pass-through will be cancelled if
>      the lbr feature has been reclaimed by a cpu pinned lbr event.
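
To make the scheduling rules in the Patch 7 and Patch 10 notes above easier to picture, here is a rough userspace sketch, not code from the series. It assumes a much simplified scheduler that walks events in priority order; struct model_event, model_schedule(), MODEL_CAP_NO_COUNTER (a stand-in for PERF_EV_CAP_NO_COUNTER with an assumed value) and MODEL_VCPU_LBR_BIT are hypothetical names made up for illustration. Only X86_PMC_IDX_NA and the use of bit 62 come from the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_PMC_IDX_NA        (-1)          /* never assigned counter id */
#define MODEL_CAP_NO_COUNTER  0x1u          /* assumed stand-in for PERF_EV_CAP_NO_COUNTER */
#define MODEL_VCPU_LBR_BIT    (1ULL << 62)  /* reserved LBR_SELECT bit, used as a tag */

/* Hypothetical, simplified view of a perf event for this sketch only. */
struct model_event {
    unsigned int caps;    /* capability flags */
    bool uses_lbr;        /* event needs the lbr feature */
    uint64_t lbr_config;  /* LBR_SELECT config, meaningful if uses_lbr */
    int idx;              /* assigned counter, or X86_PMC_IDX_NA */
    bool scheduled;       /* result of the scheduling pass */
};

/*
 * Walk the events in priority order (index 0 is the highest priority).
 * Returns the number of events that were scheduled.
 */
static int model_schedule(struct model_event *ev, int n, int num_counters)
{
    bool lbr_claimed = false;
    uint64_t lbr_owner_config = 0;
    int next_counter = 0, scheduled = 0;

    for (int i = 0; i < n; i++) {
        ev[i].idx = X86_PMC_IDX_NA;
        ev[i].scheduled = false;

        /* Constraint check: lbr is shared only between equal configs. */
        if (ev[i].uses_lbr && lbr_claimed &&
            ev[i].lbr_config != lbr_owner_config)
            continue;

        /* No-counter events still go through the scheduler, minus a counter. */
        if (!(ev[i].caps & MODEL_CAP_NO_COUNTER)) {
            if (next_counter >= num_counters)
                continue;
            ev[i].idx = next_counter++;
        }

        if (ev[i].uses_lbr && !lbr_claimed) {
            lbr_claimed = true;
            lbr_owner_config = ev[i].lbr_config;
        }
        ev[i].scheduled = true;
        scheduled++;
    }
    return scheduled;
}
```

In this model a vcpu lbr event tagged with MODEL_VCPU_LBR_BIT never matches a host lbr event's config, so whichever of the two has the lower priority fails to get lbr, while an unrelated counter event is unaffected.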
>
> Previous:
> https://lkml.kernel.org/r/1562548999-37095-1-git-send-email-wei.w.wang@intel.com
>
> Wei Wang (14):
>   perf/x86: fix the variable type of the lbr msrs
>   perf/x86: add a function to get the addresses of the lbr stack msrs
>   KVM/x86: KVM_CAP_X86_GUEST_LBR
>   KVM/x86: intel_pmu_lbr_enable
>   KVM/x86/vPMU: tweak kvm_pmu_get_msr
>   KVM/x86: expose MSR_IA32_PERF_CAPABILITIES to the guest
>   perf/x86: support to create a perf event without counter allocation
>   perf/core: set the event->owner before event_init
>   KVM/x86/vPMU: APIs to create/free lbr perf event for a vcpu thread
>   perf/x86/lbr: don't share lbr for the vcpu usage case
>   perf/x86: save/restore LBR_SELECT on vcpu switching
>   KVM/x86/lbr: lbr emulation
>   KVM/x86/vPMU: check the lbr feature before entering guest
>   KVM/x86: remove the common handling of the debugctl msr
>
>  Documentation/virt/kvm/api.txt    |  26 +++
>  arch/x86/events/core.c            |  36 ++-
>  arch/x86/events/intel/core.c      |   3 +
>  arch/x86/events/intel/lbr.c       |  95 +++++++-
>  arch/x86/events/perf_event.h      |   6 +-
>  arch/x86/include/asm/kvm_host.h   |   5 +
>  arch/x86/include/asm/perf_event.h |  17 ++
>  arch/x86/kvm/cpuid.c              |   2 +-
>  arch/x86/kvm/pmu.c                |  24 +-
>  arch/x86/kvm/pmu.h                |  11 +-
>  arch/x86/kvm/pmu_amd.c            |   7 +-
>  arch/x86/kvm/vmx/pmu_intel.c      | 476 +++++++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/vmx/vmx.c            |   4 +-
>  arch/x86/kvm/vmx/vmx.h            |   2 +
>  arch/x86/kvm/x86.c                |  47 ++--
>  include/linux/perf_event.h        |  18 ++
>  include/uapi/linux/kvm.h          |   1 +
>  kernel/events/core.c              |  19 +-
>  18 files changed, 738 insertions(+), 61 deletions(-)
>
> --
> 2.7.4
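
For reviewers who want the emulation flow from the cover letter in one place, below is a small userspace model of it. This is a sketch only, not the code in the series: struct lbr_model and the lbr_model_* functions are hypothetical names, event creation is assumed to always succeed and gain ownership, and freeing an untouched event at the next sched-in is an assumed timing (the actual rule is in the patch 12 commit).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one vcpu's lbr emulation state; not KVM code. */
struct lbr_model {
    bool has_event;    /* vcpu thread has a task pinned lbr perf event */
    bool owns_lbr;     /* that event currently owns the lbr feature */
    bool passthrough;  /* lbr msrs are passed through to the guest */
    bool touched;      /* guest trapped on an lbr msr in this time slice */
};

enum lbr_access {
    LBR_ACCESS_DIRECT,     /* passed through, no vm-exit */
    LBR_ACCESS_EMULATED,   /* vm-exit, handled, pass-through enabled */
    LBR_ACCESS_FORBIDDEN,  /* vm-exit, lbr is owned by the host */
};

/* vcpu scheduled in: intercept the msrs so the first access vm-exits. */
static void lbr_model_sched_in(struct lbr_model *m)
{
    /* Assumption: an event left untouched for a whole slice is freed here. */
    if (m->has_event && !m->touched) {
        m->has_event = false;
        m->owns_lbr = false;
    }
    m->passthrough = false;
    m->touched = false;
}

/* A cpu pinned lbr event on the host takes over the lbr feature. */
static void lbr_model_host_reclaim(struct lbr_model *m)
{
    m->owns_lbr = false;
    m->passthrough = false;
}

/* The guest reads or writes an lbr related msr. */
static enum lbr_access lbr_model_guest_access(struct lbr_model *m)
{
    if (m->passthrough && m->owns_lbr)
        return LBR_ACCESS_DIRECT;

    /* Intercepted: the access vm-exits to the msr handler. */
    m->touched = true;
    if (!m->has_event) {
        /* Assumption: creation succeeds and the new event owns lbr. */
        m->has_event = true;
        m->owns_lbr = true;
    }
    if (!m->owns_lbr)
        return LBR_ACCESS_FORBIDDEN;

    m->passthrough = true;
    return LBR_ACCESS_EMULATED;
}
```

Calling lbr_model_sched_in() followed by two lbr_model_guest_access() calls gives one emulated access and then one direct access; after lbr_model_host_reclaim() the next access is forbidden, matching the pass-through cancellation described above.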