Hi all,

On 5/10/2023 6:05 am, Sean Christopherson wrote:
> So I'll add a self-NAK to the idea of completely disabling the host PMU, I think
> that would burn us quite badly at some point.

I seem to have missed a party, so allow me to add a few more comments to better facilitate future discussions in this direction:

(1) PMU counters on TEE

SGX and SEV are already upstream, but we have neither a clear idea nor the right expectations about what kind of performance data is obtained by sampling enclaves or SEV guests with hardware PMU counters on the host (will perf-report show holes where the data is missing, or purely encrypted data?).

But on AMD, profiling a SEV-SNP guest is supported:

"Fingerprinting attack protection is also not supported in the current generation of these technologies. Fingerprinting attacks attempt to determine what code the VM is running by monitoring its access patterns, performance counter information, etc." (AMD SEV-SNP White Paper, 2020)

(2) PMU Guest/Host Co-existence Development

The introduction of pt_mode in KVM was misleading, leading subsequent developers to believe that static slicing of PMU facility usage was allowed.

In user scenarios, the host/perf should treat PMU resource requests from vCPUs with regularity (which can be unequal under the host's authority IMO) while still allowing the host to profile any software entity (including the hypervisor and guest code, including TEE code in debug mode). Functionality takes precedence over performance.

The semantics of exclude_guest/exclude_host should be tied to the hw-event isolation settings on the hardware interfaces, not to the human-defined sw-context.

The perf subsystem is the arbiter of PMU resource allocation on the host, and any attempt to change the status quo (or maintenance scope) will not succeed. Therefore, vPMU developers are required to be familiar with the implementation details of both perf and KVM, and should try not to blindly add perf APIs dedicated to serving KVM.

Getting host and guests to share limited PMU resources harmoniously is not particularly difficult compared to real rocket science in the kernel, so please don't be intimidated.
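
To make the exclude_guest/exclude_host point a bit more concrete, here is a small toy model in Python (not kernel code). It assumes one possible reading of the paragraph above: whether an event is counted is decided by the CPU mode at the moment the event occurs (host mode vs. guest mode, which is what the hardware can isolate), not by which task happens to be scheduled. EventAttr, Slice, counts_in and count_events are made-up names for illustration; only the exclude_host/exclude_guest bit names come from perf_event_attr.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventAttr:
    """Toy stand-in for the two perf_event_attr bits discussed above."""
    exclude_host: bool = False
    exclude_guest: bool = False


@dataclass(frozen=True)
class Slice:
    """A stretch of execution on one CPU (hypothetical trace record)."""
    task: str            # software context, e.g. the vCPU thread
    in_guest_mode: bool  # hardware mode: True between VM-entry and VM-exit
    events: int          # events that occurred during this slice


def counts_in(attr: EventAttr, in_guest_mode: bool) -> bool:
    """Decide from the hardware mode alone whether the counter runs."""
    if in_guest_mode:
        return not attr.exclude_guest
    return not attr.exclude_host


def count_events(attr: EventAttr, trace: list) -> int:
    """Sum the events from every slice in which the counter is running."""
    return sum(s.events for s in trace if counts_in(attr, s.in_guest_mode))


# One vCPU thread: the software context never changes, the hardware mode does.
trace = [
    Slice("vcpu-thread", in_guest_mode=False, events=10),  # KVM run loop
    Slice("vcpu-thread", in_guest_mode=True, events=70),   # guest code
    Slice("vcpu-thread", in_guest_mode=False, events=20),  # VM-exit handling
]

host_only = count_events(EventAttr(exclude_guest=True), trace)
guest_only = count_events(EventAttr(exclude_host=True), trace)
everything = count_events(EventAttr(), trace)
print(host_only, guest_only, everything)
```

In this model the two host-mode slices stay visible to a host profiler that sets exclude_guest, even though all three slices belong to the same vCPU thread; a definition based purely on the software context could not make that distinction.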

(3) Performance Concern in Co-existence

I wonder if it would be possible to add a knob to turn off the perf counter multiplexing mechanism on the host, so that in co-existence scenarios the number of VM exits on the vCPU would not be increased by counter rotations due to timer expiration.

For normal counters shared between guest and host, the number of counter MSR switches required at the VM-entry level will be relatively small. (The number of counters is growing; for LBR, it is possible to share LBR select values to avoid frequent switching, but of course this requires implementing a software filtering mechanism when the host/guest read the LBR records, and some additional PMIs; for DS-based PEBS, host and guest PEBS buffers are automatically segregated based on linear address.)

There is a lot of room for optimisation here, and in real scenarios a large number of register switches in the host/guest PMU is to be expected and easily observed (accompanied by a large number of PMIs).

If we are really worried about the virtualisation overhead of vPMU, then virtio-pmu might be an option. In this technology direction, the back-end PMU can add more performance events of interest to the VM (including host un-core and off-core events, host-side software events, etc.). In terms of implementation, the semantics of the MSRLIST instruction can be re-used, along with compatibility with the different PMU hardware interfaces on ARM and RISC-V, which is also very friendly to production environments thanks to its virtio nature.

(4) New vPMU Feature Development

We should not put KVM's current vPMU support into maintenance-only mode. Users want more PMU features in the guest, like AMD vIBS, higher Intel PMU versions, Intel topdown and Arch LBR, with more on the way.
The maturity of the different features' patch sets isn't the same, but we can't ignore these real needs because of the limited time of key maintainers, apathy towards contributors, mindset avoidance and laziness, or a preference for certain technology stacks. These technical challenges will attract an influx of open source heroes to push the technology forward, which is good in the long run.

(5) More to think about

Similar to the guest PMU feature, the debugging feature may face the same situation. For example, what happens when you debug code inside the host and guest at the same time (host debugs hypervisor/guest code and guest debugs guest code only)?

Forgive my ignorance and offence, but we don't want to see a KVM subsystem controlled and driven by Google's demands.

Please feel free to share comments to move forward.

Thanks,
Like Xu