Hi Reiji,

Catching up on this.

On Tue, 23 Aug 2022 05:30:21 +0100,
Reiji Watanabe <reijiw@xxxxxxxxxx> wrote:
>
> Hi Marc,
>
> On Fri, Aug 5, 2022 at 6:58 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > Ricardo recently pointed out that the PMU chained counter emulation
> > in KVM wasn't quite behaving like the one on actual hardware, in
> > the sense that a chained counter would expose an overflow on
> > both halves of a chained counter, while KVM would only expose the
> > overflow on the top half.
> >
> > The difference is subtle, but significant. What does the architecture
> > say (DDI0487 H.a):
> >
> > - Before PMUv3p4, all counters but the cycle counter are 32bit
> > - A 32bit counter that overflows generates a CHAIN event on the
> >   adjacent counter after exposing its own overflow status
> > - The CHAIN event is accounted if the counter is correctly
> >   configured (CHAIN event selected and counter enabled)
> >
> > This all means that our current implementation (which uses 64bit
> > perf events) prevents us from emulating this overflow on the lower half.
> >
> > How to fix this? By implementing the above, to the letter.
> >
> > This largely results in code deletion, removing the notions of
> > "counter pair", "chained counters", and "canonical counter".
> > The code is further restructured to make the CHAIN handling similar
> > to SWINC, as the two are now extremely similar in behaviour.
> >
> > Reported-by: Ricardo Koller <ricarkol@xxxxxxxxxx>
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > ---
> >  arch/arm64/kvm/pmu-emul.c | 324 +++++++++++---------------------------
> >  include/kvm/arm_pmu.h     |   2 -
> >  2 files changed, 91 insertions(+), 235 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index 11c43bed5f97..4986e8b3ea6c 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c

[...]
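
For anyone following along, the three architectural rules quoted in the
commit message can be modelled in a few lines of user-space C. This is
only a sketch, not the KVM code: struct toy_pmu, toy_pmu_event(),
toy_pmu_demo() and the four-counter size are all made up for the
example. It also assumes the usual pairing, where an even counter
chains into the odd counter right above it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_COUNTERS	4	/* hypothetical size, chosen for the example */
#define TOY_EVT_SW_INCR	0x00	/* software increment event */
#define TOY_EVT_CHAIN	0x1e	/* CHAIN event */

/* Toy register file: the names are illustrative, not KVM's. */
struct toy_pmu {
	uint32_t cnt[TOY_NR_COUNTERS];		/* 32-bit event counters */
	uint32_t evtype[TOY_NR_COUNTERS];	/* event selected per counter */
	uint32_t enabled;			/* bit i: counter i enabled */
	uint32_t ovf;				/* bit i: counter i overflowed */
};

/* Deliver one occurrence of @event to counter @i. */
static void toy_pmu_event(struct toy_pmu *p, unsigned int i, uint32_t event)
{
	assert(i < TOY_NR_COUNTERS);

	/* Only an enabled counter programmed with @event counts it. */
	if (!(p->enabled & (UINT32_C(1) << i)) || p->evtype[i] != event)
		return;

	/* Unsigned 32-bit arithmetic wraps to 0 on overflow. */
	if (++p->cnt[i] != 0)
		return;

	/* Expose the overflow on this counter first... */
	p->ovf |= UINT32_C(1) << i;

	/* ...then an even counter signals CHAIN to its odd neighbour. */
	if (!(i & 1) && i + 1 < TOY_NR_COUNTERS)
		toy_pmu_event(p, i + 1, TOY_EVT_CHAIN);
}

/* Both halves sit at 0xffffffff, so one event overflows both. */
static uint32_t toy_pmu_demo(void)
{
	struct toy_pmu p = {
		.cnt = { UINT32_MAX, UINT32_MAX },
		.evtype = { TOY_EVT_SW_INCR, TOY_EVT_CHAIN },
		.enabled = 0x3,
	};

	toy_pmu_event(&p, 0, TOY_EVT_SW_INCR);
	return p.ovf;	/* bits 0 and 1 are both set */
}
```

The point of the model is that the low half records its own overflow
before the CHAIN event is delivered, which a single 64bit counter
cannot express.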
> > +/*
> > + * Perform an increment on any of the counters described in @mask,
> > + * generating the overflow if required, and propagate it as a chained
> > + * event if possible.
> > + */
> > +static void kvm_pmu_counter_increment(struct kvm_vcpu *vcpu,
> > +				      unsigned long mask, u32 event)
> > +{
> > +	int i;
> > +
> > +	if (!kvm_vcpu_has_pmu(vcpu))
> > +		return;
> > +
> > +	if (!(__vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
> > +		return;
> > +
> > +	/* Weed out disabled counters */
> > +	mask &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> > +
> > +	for_each_set_bit(i, &mask, ARMV8_PMU_CYCLE_IDX) {
> > +		u64 type, reg;
> > +
> > +		/* Filter on event type */
> > +		type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i);
> > +		type &= kvm_pmu_event_mask(vcpu->kvm);
> > +		if (type != event)
> > +			continue;
> > +
> > +		/* Increment this counter */
> > +		reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> > +		reg = lower_32_bits(reg);
> > +		__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> > +
> > +		if (reg) /* No overflow? move on */
> > +			continue;
> > +
> > +		/* Mark overflow */
> > +		__vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
>
> Perhaps it might be useful to create another helper that takes
> care of just one counter (it would essentially do the code above
> in the loop). The helper could be used (in addition to the above
> loop) from the code below for the CHAIN event case and from
> kvm_pmu_perf_overflow(). Then unnecessary execution of
> for_each_set_bit() could be avoided for these two cases.

I'm not sure it really helps. We would still need to check whether the
counter is enabled, and we'd need to bring that into the helper
instead of keeping it outside of the loop.

[...]
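
To illustrate the trade-off, here is roughly what a single-counter
helper would have to look like, written as a stand-alone sketch rather
than against the real vcpu state. Everything in it is hypothetical:
struct sketch_pmu and sketch_increment_one() do not exist in KVM, and
the pmu_enabled and cnten fields merely stand in for PMCR_EL0.E and
PMCNTENSET_EL0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_COUNTERS	4	/* hypothetical size for the example */

struct sketch_pmu {
	bool pmu_enabled;			/* stands in for PMCR_EL0.E */
	uint32_t cnten;				/* stands in for PMCNTENSET_EL0 */
	uint32_t evtype[SKETCH_NR_COUNTERS];	/* event selected per counter */
	uint32_t cnt[SKETCH_NR_COUNTERS];	/* 32-bit event counters */
	uint32_t ovf;				/* overflow status, one bit each */
};

/*
 * Increment counter @i if it is enabled and counting @event.
 * Returns true when the counter wrapped and its overflow bit was set.
 */
static bool sketch_increment_one(struct sketch_pmu *p, unsigned int i,
				 uint32_t event)
{
	assert(i < SKETCH_NR_COUNTERS);

	/*
	 * The mask-based loop performs these two checks once, up front.
	 * A per-counter helper has to repeat them on every call.
	 */
	if (!p->pmu_enabled)
		return false;
	if (!(p->cnten & (UINT32_C(1) << i)))
		return false;

	/* Filter on event type. */
	if (p->evtype[i] != event)
		return false;

	/* Unsigned 32-bit increment; a result of 0 means it wrapped. */
	if (++p->cnt[i] != 0)
		return false;

	p->ovf |= UINT32_C(1) << i;
	return true;
}
```

The enable checks end up inside the helper, so each caller pays for
them per counter instead of once per mask.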
> > @@ -625,30 +528,27 @@ static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx)
> >  {
> >  	struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu;
> >  	struct kvm_pmu *pmu = &vcpu->arch.pmu;
> > -	struct kvm_pmc *pmc;
> > +	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> >  	struct perf_event *event;
> >  	struct perf_event_attr attr;
> >  	u64 eventsel, counter, reg, data;
> >
> > -	/*
> > -	 * For chained counters the event type and filtering attributes are
> > -	 * obtained from the low/even counter. We also use this counter to
> > -	 * determine if the event is enabled/disabled.
> > -	 */
> > -	pmc = kvm_pmu_get_canonical_pmc(&pmu->pmc[select_idx]);
> > -
> > -	reg = (pmc->idx == ARMV8_PMU_CYCLE_IDX)
> > +	reg = (select_idx == ARMV8_PMU_CYCLE_IDX)
> >  	      ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + pmc->idx;
>
> You may want to use select_idx instead of pmc->idx for consistency?

Yes. Although Oliver had a point in saying that these pmc->idx vs
select_idx conversions were not strictly necessary and cluttered the
patch.

[...]

> > @@ -752,11 +607,15 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx)
> >  void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
> >  				    u64 select_idx)
> >  {
> > +	struct kvm_pmu *pmu = &vcpu->arch.pmu;
> > +	struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> >  	u64 reg, mask;
> >
> >  	if (!kvm_vcpu_has_pmu(vcpu))
> >  		return;
> >
> > +	kvm_pmu_stop_counter(vcpu, pmc);
>
> It appears that kvm_pmu_stop_counter() doesn't have to be called here
> because it is called in the beginning of kvm_pmu_create_perf_event().

It feels a bit odd to change the event type without stopping the
counter first, but I can't see anything going wrong if we omit it.

I'll drop it.

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.