Hi Oliver, Marc,

Thank you for the review!

On Fri, Jan 20, 2023 at 10:05 AM Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
>
> Hey Marc,
>
> On Fri, Jan 20, 2023 at 12:12:32PM +0000, Marc Zyngier wrote:
> > On Fri, 20 Jan 2023 00:30:33 +0000, Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
> > > I think we need to derive a sanitised value for PMCR_EL0.N, as I
> > > believe nothing in the architecture prevents implementers from
> > > gluing together cores with varying numbers of PMCs. We probably
> > > haven't noticed it yet since it would appear all Arm designs have
> > > had 6 PMCs.
> >
> > This brings back the question of late onlining. How do you cope
> > with the onlining of such a CPU that has a smaller set of counters
> > than its online counterparts? This is at odds with the way the PMU
> > code works.
>
> You're absolutely right, any illusion we derived from the online set
> of CPUs could fall apart with a late onlining of a different core.
>
> > If you have a different set of counters, you are likely to have a
> > different PMU altogether:
> >
> > [    1.192606] hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available
> > [    1.201254] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
> >
> > This isn't a broken system, but it has two sets of cores which are
> > massively different, and two PMUs.
> >
> > This really should tie back to the PMU type we're counting on, and
> > to the set of CPUs that implements it. We already have some
> > infrastructure to check for the affinity of the PMU vs the CPU
> > we're running on, and this is already visible to userspace.
> >
> > Can't we just leave this responsibility to userspace?
>
> Believe me, I'm always a fan of offloading things to userspace :)
>
> If the VMM is privy to the details of the system it is on, then the
> differing PMUs can be passed through to the guest w/ pinned vCPU
> threads. I just worry about the case of a naive VMM that assumes a
> homogeneous system. I don't think I could entirely blame the VMM in
> this case either, as we've gone to lengths to sanitise the feature
> set exposed to userspace.
>
> What happens when a vCPU gets scheduled on a core where the vPMU
> doesn't match? Ignoring other incongruences, it is not possible to
> virtualize more counters than are supported by the vPMU of the core.

I believe KVM_RUN will fail with KVM_EXIT_FAIL_ENTRY in that case
(please see the code that handles ON_UNSUPPORTED_CPU; there is a
sketch of how userspace could catch this in the P.S. below).

> Stopping short of any major hacks in the kernel to fudge around the
> problem, I believe we may need to provide better documentation of how
> heterogeneous CPUs are handled in KVM and what userspace can do about
> it.

Documentation/virt/kvm/devices/vcpu.rst has some description of the
current behavior for KVM_ARM_VCPU_PMU_V3_SET_PMU, at least (perhaps we
need to update the documentation for this, though). The P.S. below
also has a sketch of how a VMM with pinned vCPU threads could use
SET_PMU for this.

Now I'm a bit worried about the validation code for PMCR_EL0.N as
well, since setting (restoring) PMCR_EL0 could be done on any pCPU
(even before KVM_ARM_VCPU_PMU_V3_SET_PMU is used). What I am currently
looking at is something like this:

 - Set the sanitised (i.e. minimum) value of PMCR_EL0.N across all
   PMUs on the system as the default for vCPUs.
 - Validate the PMCR_EL0.N value that userspace tries to set against
   the maximum value on the system (this is to ensure that restoring
   PMCR_EL0 for a vCPU works on any pCPU).
 - Make KVM_RUN fail when PMCR_EL0.N for the vCPU indicates more
   counters than the PMU that is set for the vCPU supports.

(Rough pseudo-code for this is at the end of the P.S. below.)

What do you think?

Thank you,
Reiji
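
P.S. A few rough sketches, since code may say this more precisely than
prose. All of them are untested illustrations, not tested patches.

First, catching the failed entry mentioned above. The fail_entry
fields and KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED below are from my
reading of the current UAPI headers:

  #include <linux/kvm.h>
  #include <stdio.h>
  #include <sys/ioctl.h>

  /*
   * Run the vCPU once; "run" is the mmap'ed struct kvm_run for the
   * given vCPU fd.
   */
  static int vcpu_run_once(int vcpu_fd, struct kvm_run *run)
  {
          if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
                  return -1;

          if (run->exit_reason == KVM_EXIT_FAIL_ENTRY &&
              run->fail_entry.hardware_entry_failure_reason ==
              KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED) {
                  /*
                   * This is what the ON_UNSUPPORTED_CPU handling
                   * reports: the vCPU was scheduled on a pCPU that
                   * cannot support it (e.g. a mismatched vPMU).
                   */
                  fprintf(stderr, "entry failed on cpu%u (reason 0x%llx)\n",
                          run->fail_entry.cpu,
                          (unsigned long long)run->fail_entry.hardware_entry_failure_reason);
                  return -1;
          }

          /* ... handle the other exit reasons ... */
          return 0;
  }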
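
Second, the "leave it to userspace" option with
KVM_ARM_VCPU_PMU_V3_SET_PMU, for a VMM that pins its vCPU threads
(error handling omitted):

  #include <linux/kvm.h>
  #include <sys/ioctl.h>

  /*
   * Pick the vCPU's PMU explicitly. "pmu_type" is the id the caller
   * read from /sys/bus/event_source/devices/<pmu>/type, e.g. from
   * armv8_cortex_a57 in Marc's example above.
   */
  static int set_vcpu_pmu(int vcpu_fd, int pmu_type)
  {
          struct kvm_device_attr attr = {
                  .group = KVM_ARM_VCPU_PMU_V3_CTRL,
                  .attr  = KVM_ARM_VCPU_PMU_V3_SET_PMU,
                  .addr  = (__u64)(unsigned long)&pmu_type,
          };

          return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
  }

With the vCPU threads pinned to (say) the A57 cores and the attribute
set before the vCPU first runs, the vPMU then matches every pCPU the
thread can be scheduled on.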
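
Lastly, pseudo-code for the PMCR_EL0.N scheme above. This is only to
illustrate the idea; kvm_host_min_pmcr_n() and kvm_host_max_pmcr_n()
are made-up helpers that would scan the PMUs registered on the system
for the smallest/largest number of counters:

  /* Default N for a vCPU: the sanitised (minimum) value across all
   * PMUs on the system. */
  static void reset_pmcr_n(struct kvm_vcpu *vcpu)
  {
          u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0);

          pmcr &= ~((u64)ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
          pmcr |= (u64)kvm_host_min_pmcr_n() << ARMV8_PMU_PMCR_N_SHIFT;
          __vcpu_sys_reg(vcpu, PMCR_EL0) = pmcr;
  }

  /* Userspace writes to PMCR_EL0: allow anything up to the system-wide
   * maximum, so that restoring the register works on any pCPU, even
   * before KVM_ARM_VCPU_PMU_V3_SET_PMU is used. */
  static int set_pmcr(struct kvm_vcpu *vcpu, u64 val)
  {
          u8 n = (val >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;

          if (n > kvm_host_max_pmcr_n())
                  return -EINVAL;

          __vcpu_sys_reg(vcpu, PMCR_EL0) = val;
          return 0;
  }

  /* At KVM_RUN time: fail if the vCPU's N exceeds what its PMU
   * implements (arm_pmu::num_events includes the cycle counter,
   * hence the -1). */
  static int kvm_check_vcpu_pmcr_n(struct kvm_vcpu *vcpu)
  {
          u64 pmcr = __vcpu_sys_reg(vcpu, PMCR_EL0);
          u8 n = (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;

          if (n > vcpu->kvm->arch.arm_pmu->num_events - 1)
                  return -ENOEXEC;

          return 0;
  }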