On Tue, Aug 15, 2023 at 07:32:40AM +0100, Marc Zyngier wrote: > On Mon, 14 Aug 2023 08:16:27 +0100, > Leo Yan <leo.yan@xxxxxxxxxx> wrote: > > > > On Fri, Aug 11, 2023 at 07:05:20PM +0100, Marc Zyngier wrote: > > > Huang Shijie reports that, when profiling a guest from the host > > > with a number of events that exceeds the number of available > > > counters, the reported counts are wildly inaccurate. Without > > > the counter oversubscription, the reported counts are correct. > > > > > > Their investigation indicates that upon counter rotation (which > > > takes place on the back of a timer interrupt), we fail to > > > re-apply the guest EL0 enabling, leading to the counting of host > > > events instead of guest events. > > > > Seems to me, it's not clear for why the counter rotation will cause > > the issue. > > Maybe unclear to you, but rather clear to me (and most people else on > Cc). I have to admit this it true. > > In the example shared by Shijie in [1], the cycle counter is enabled > > for both host and guest > > No. You're misreading the example. We're profiling the guest from the > host, and the guest has no PMU access. > > > and cycle counter is a dedicated event > > which does not share counter with other events. Even there have > > counter rotation, it should not impact the cycle counter. > > Who says that we're counting cycles using the cycle counter? This is > an event like any other, and it can be counted on any counter. Sorry for noise. > > I mean if we cannot explain clearly for this part, we don't find the > > root cause, and this patch (and Shijie's patch) just walks around the > > issue. > > We have the root cause. You just need to think a bit harder. Let me elaborate a bit more for my concern. The question is how we can know the exactly the host and the guest have the different counter enabling? Shijie's patch relies on perf event rotation to trigger syncing for PMU PMEVTYPER and PMCCFILTR registers. The perf event rotation will enable and disable some events, but it doesn't mean the host and the guest enable different counters. If we use the perf event rotation to trigger syncing, there must introduce redundant operations. In your patch, it resyncs the PMU registers in the function armv8pmu_start(), this function is invoked not only when start PMU event, it also is invoked in PMU interrupt handler (see armv8pmu_handle_irq()), this also will lead to redundant syncing if we use the perf record command for PMU event sampling: perf record -e cycles:G,cycles:H -d -d -- sleep 10 This is why I think we should trigger the syncing in the function kvm_set_pmu_events(), where we can know exactly the event mismatching between the host and the guest. At the beginning it has checked the difference between the host and the guest by calling kvm_pmu_switch_needed(attr), thus we don't need to add more condition checking and directly call kvm_vcpu_pmu_resync_el0(). diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c index 121f1a14c829..99adcdbb6a5d 100644 --- a/arch/arm64/kvm/pmu.c +++ b/arch/arm64/kvm/pmu.c @@ -46,6 +46,12 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) pmu->events_host |= set; if (!attr->exclude_guest) pmu->events_guest |= set; + + /* + * The host and the guest enable different events for EL0, + * resync it. + */ + kvm_vcpu_pmu_resync_el0(); } Thanks, Leo