On Thu, 5 Dec 2019 20:01:42 +0100
Auger Eric <eric.auger@xxxxxxxxxx> wrote:

Hi Eric,

> Hi Marc,
>
> On 12/5/19 3:52 PM, Marc Zyngier wrote:
> > On 2019-12-05 14:06, Auger Eric wrote:
> >> Hi Marc,
> >>
> >> On 12/5/19 10:43 AM, Marc Zyngier wrote:
> >>> Hi Eric,
> >>>
> >>> On 2019-12-04 20:44, Eric Auger wrote:
> >>>> At the moment a SW_INCR counter always overflows on a 32-bit
> >>>> boundary, independently of whether the (n+1)th counter is
> >>>> programmed as CHAIN.
> >>>>
> >>>> Check whether the SW_INCR counter is a 64b counter and, if so,
> >>>> implement the 64b logic.
> >>>>
> >>>> Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters")
> >>>> Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx>
> >>>> ---
> >>>>  virt/kvm/arm/pmu.c | 16 +++++++++++++++-
> >>>>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>>> index c3f8b059881e..7ab477db2f75 100644
> >>>> --- a/virt/kvm/arm/pmu.c
> >>>> +++ b/virt/kvm/arm/pmu.c
> >>>> @@ -491,6 +491,8 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
> >>>>
> >>>>      enable = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
> >>>>      for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
> >>>> +        bool chained = test_bit(i >> 1, vcpu->arch.pmu.chained);
> >>>> +
> >>>
> >>> I'd rather you use kvm_pmu_pmc_is_chained() rather than open-coding
> >>> this. But see below:
> >>>
> >>>>          if (!(val & BIT(i)))
> >>>>              continue;
> >>>>          type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
> >>>> @@ -500,8 +502,20 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
> >>>>          reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
> >>>>          reg = lower_32_bits(reg);
> >>>>          __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
> >>>> -        if (!reg)
> >>>> +        if (reg) /* no overflow */
> >>>> +            continue;
> >>>> +        if (chained) {
> >>>> +            reg = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1;
> >>>> +            reg = lower_32_bits(reg);
> >>>> +            __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
> >>>> +            if (reg)
> >>>> +                continue;
> >>>> +            /* mark an overflow on high counter */
> >>>> +            __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
> >>>> +        } else {
> >>>> +            /* mark an overflow */
> >>>>              __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
> >>>> +        }
> >>>>      }
> >>>>  }
> >>>
> >>> I think the whole function is a bit of a mess, and could be better
> >>> structured to treat 64bit counters as a first class citizen.
> >>>
> >>> I'm suggesting something along those lines, which tries to
> >>> streamline things a bit and keep the flow uniform between the
> >>> two word sizes. IMHO, it helps reasoning about it and gives
> >>> scope to the ARMv8.5 full 64bit counters... It is of course
> >>> completely untested.
> >>
> >> Looks OK to me as well. One remark though: don't we need to test if
> >> the (n+1)th reg is enabled before incrementing it?
> >
> > Hmmm. I'm not sure. I think we should make sure that we don't flag
> > a counter as being chained if the odd counter is disabled, rather
> > than checking it here. As long as the odd counter is not both chained
> > *and* enabled, we shouldn't touch it.
> >
> > Again, untested:
> >
> > diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> > index cf371f643ade..47366817cd2a 100644
> > --- a/virt/kvm/arm/pmu.c
> > +++ b/virt/kvm/arm/pmu.c
> > @@ -15,6 +15,7 @@
> >  #include <kvm/arm_vgic.h>
> >
> >  static void kvm_pmu_create_perf_event(struct kvm_vcpu *vcpu, u64 select_idx);
> > +static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx);
> >
> >  #define PERF_ATTR_CFG1_KVM_PMU_CHAINED 0x1
> >
> > @@ -298,6 +299,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu, u64 val)
> >           * For high counters of chained events we must recreate the
> >           * perf event with the long (64bit) attribute set.
> >           */
> > +        kvm_pmu_update_pmc_chained(vcpu, i);
> >          if (kvm_pmu_pmc_is_chained(pmc) &&
> >              kvm_pmu_idx_is_high_counter(i)) {
> >              kvm_pmu_create_perf_event(vcpu, i);
> > @@ -645,7 +647,8 @@ static void kvm_pmu_update_pmc_chained(struct kvm_vcpu *vcpu, u64 select_idx)
> >      struct kvm_pmu *pmu = &vcpu->arch.pmu;
> >      struct kvm_pmc *pmc = &pmu->pmc[select_idx];
> >
> > -    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx)) {
> > +    if (kvm_pmu_idx_has_chain_evtype(vcpu, pmc->idx) &&
> > +        kvm_pmu_counter_is_enabled(vcpu, pmc->idx)) {
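
[Illustrative sketch, not posted in the thread: combining the two suggestions above (treat a chained pair as a single 64-bit counter, and weed out disabled counters up front), the SW_INCR path could end up looking roughly like this. It is untested and only shows the intended shape; the helpers it uses (kvm_pmu_pmc_is_chained(), __vcpu_sys_reg(), lower_32_bits()) already exist in virt/kvm/arm/pmu.c.]

void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
{
    struct kvm_pmu *pmu = &vcpu->arch.pmu;
    int i;

    /* Weed out disabled counters up front */
    val &= __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
    if (!val)
        return;

    for (i = 0; i < ARMV8_PMU_CYCLE_IDX; i++) {
        u64 type, reg;

        if (!(val & BIT(i)))
            continue;

        /* PMSWINC only applies to counters programmed as SW_INCR */
        type = __vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i) &
               ARMV8_PMU_EVTYPE_EVENT;
        if (type != ARMV8_PMUV3_PERFCTR_SW_INCR)
            continue;

        /* Increment the low word; counters wrap at 32 bits */
        reg = lower_32_bits(__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1);
        __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) = reg;
        if (reg) /* no overflow on the low word */
            continue;

        if (kvm_pmu_pmc_is_chained(&pmu->pmc[i])) {
            /* Carry into the high (odd) counter of the pair */
            reg = lower_32_bits(__vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) + 1);
            __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i + 1) = reg;
            if (!reg) /* 64-bit overflow: flag the high counter */
                __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i + 1);
        } else {
            /* 32-bit overflow: flag this counter */
            __vcpu_sys_reg(vcpu, PMOVSSET_EL0) |= BIT(i);
        }
    }
}

[Note that in this shape a chained pair only ever flags an overflow on the high counter, matching the architectural view of the pair as one 64-bit counter.]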

> In create_perf_event(), has_chain_evtype() is used, and a 64b sample
> period would be chosen even if the counters are disjoint (since the
> odd one is disabled). We would need to use pmc_is_chained() instead.
>
> With perf_events, the check of whether the odd register is enabled is
> properly done (in create_perf_event). As I understand it, whenever
> there is a change in enable state or type, we delete the previous perf
> event and create a new one. The enable-state check is just missing for
> SW_INCR.

Can you please respin this? I'd like to have it queued quickly, if at
all possible.

> Some other questions:
> - do we need a perf event to be created even if the counter is not
>   enabled? For instance, create_perf_event() gets called on counter
>   reset.

It shouldn't be necessary.

> - also, actions are taken for counters which are not implemented (the
>   loop runs up to ARMV8_PMU_MAX_COUNTERS). Do you think it would be
>   valuable to have a bitmask of supported counters, stored before the
>   PMU becomes ready? I can propose such changes if you think they are
>   valuable.

That would certainly be a performance optimization.

Thanks,

        M.
--
Jazz is not dead. It just smells funny...
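
[Illustrative sketch, not posted in the thread: the bitmask idea Eric raises above might look roughly like this. It is hypothetical and untested; kvm_pmu_valid_counter_mask() is the existing helper that derives the implemented counters from PMCR_EL0.N, and kvm_pmu_for_each_live_counter() is a made-up name for the pattern.]

/*
 * Walk only counters that are implemented *and* currently enabled,
 * instead of looping over all ARMV8_PMU_MAX_COUNTERS slots.
 */
static void kvm_pmu_for_each_live_counter(struct kvm_vcpu *vcpu, u64 val)
{
    unsigned long mask;
    int i;

    mask = kvm_pmu_valid_counter_mask(vcpu) &
           __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &
           val;

    for_each_set_bit(i, &mask, ARMV8_PMU_CYCLE_IDX) {
        /* per-counter work goes here, as in the loops above */
    }
}

[The mask could equally be computed once when the PMU is initialized and cached, which is the "stored before pmu readiness" variant of the idea.]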