Re: [PATCH v3 3/4] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 03, 2018 at 03:04:14PM +0000, Will Deacon wrote:
> Hi Andrew,
> 
> On Thu, Nov 22, 2018 at 11:25:49AM +0000, Andrew Murray wrote:
> > Add support for the :G and :H attributes in perf by handling the
> > exclude_host/exclude_guest event attributes.
> > 
> > We notify KVM of counters that we wish to be enabled or disabled on
> > guest entry/exit and thus defer from starting or stopping :G events
> > as per the events exclude_host attribute.
> > 
> > With both VHE and non-VHE we switch the counters between host/guest
> > at EL2. We are able to eliminate counters counting host events on
> > the boundaries of guest entry/exit when using :G by filtering out
> > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > on non-VHE then there is a small blackout window at the guest
> > entry/exit where host events are not captured.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@xxxxxxx>
> > ---
> >  arch/arm64/kernel/perf_event.c | 52 ++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 45 insertions(+), 7 deletions(-)
> 
> I've got a cold at the moment, so it could just be that, but I struggled to
> follow the logic in this patch. Comments below.
> 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index de564ae..e0619ba 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -26,6 +26,7 @@
> >  
> >  #include <linux/acpi.h>
> >  #include <linux/clocksource.h>
> > +#include <linux/kvm_host.h>
> >  #include <linux/of.h>
> >  #include <linux/perf/arm_pmu.h>
> >  #include <linux/platform_device.h>
> > @@ -647,11 +648,23 @@ static inline int armv8pmu_enable_counter(int idx)
> >  
> >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> >  {
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = event->hw.idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> > -	armv8pmu_enable_counter(idx);
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_enable_counter(idx - 1);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (attr->exclude_host)
> > +		kvm_clr_set_guest_pmu_events(0, counter_bits);
> > +	if (attr->exclude_guest)
> > +		kvm_clr_set_host_pmu_events(0, counter_bits);
> 
> Hmm, so if I have both exclude_host and exclude_guest set then we'll set
> the counter in both events_host_only and events_guest_only, which is weird
> and probably not what you want.

perf_event_attr is passed from userspace, whilst the perf tool doesn't allow
you to set exclude_host and exclude_guest at the same time it's feasiable for
another user to do so. So indeed the current behaviour wouldn't be expected.

> 
> I think I'd find it easier to understand if you dropped the "only" suffix on
> the kvm masks, and you actually made the logic positive so that an event
> that counts in both host and guest has its bit set in both of the masks.

No problem. This actually fixes a bug when multiple events are enabled.

> 
> > +
> > +	if (!attr->exclude_host) {
> > +		armv8pmu_enable_counter(idx);
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_enable_counter(idx - 1);
> > +	}
> >  }
> >  
> >  static inline int armv8pmu_disable_counter(int idx)
> > @@ -664,11 +677,23 @@ static inline int armv8pmu_disable_counter(int idx)
> >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> >  {
> >  	struct hw_perf_event *hwc = &event->hw;
> > +	struct perf_event_attr *attr = &event->attr;
> >  	int idx = hwc->idx;
> > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> >  
> >  	if (armv8pmu_event_is_chained(event))
> > -		armv8pmu_disable_counter(idx - 1);
> > -	armv8pmu_disable_counter(idx);
> > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > +
> > +	if (attr->exclude_host)
> > +		kvm_clr_set_guest_pmu_events(counter_bits, 0);
> > +	if (attr->exclude_guest)
> > +		kvm_clr_set_host_pmu_events(counter_bits, 0);
> > +
> > +	if (!attr->exclude_host) {
> > +		if (armv8pmu_event_is_chained(event))
> > +			armv8pmu_disable_counter(idx - 1);
> > +		armv8pmu_disable_counter(idx);
> > +	}
> 
> It looks like you're relying on the perf_event_attr remaining unchanged
> between the enable/disable calls for a given event. However, I'm not
> convinced that's the case -- have you checked that e.g.
> PERF_EVENT_IOC_MODIFY_ATTRIBUTES can't rewrite the thing behind your
> back?

Even better - I can unconditionally clear the counter_bits here seeing as the
event is being disabled.

> 
> >  }
> >  
> >  static inline int armv8pmu_enable_intens(int idx)
> > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> >  	 * Therefore we ignore exclude_hv in this configuration, since
> >  	 * there's no hypervisor to sample anyway. This is consistent
> >  	 * with other architectures (x86 and Power).
> > +	 *
> > +	 * To eliminate counting host events on the boundaries of
> > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > +	 * with !exclude_host.
> >  	 */
> >  	if (is_kernel_in_hyp_mode()) {
> > -		if (!attr->exclude_kernel)
> > +		if (!attr->exclude_kernel && !attr->exclude_host)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> 
> Do you know what perf passes by default here? I'm worried that we're
> introducing a user-visible change to existing programs by changing this
> check.

When running 'perf xxx' exclude_guest is set and exclude_host is unset. And the
opposite when running 'perf kvm xxx'. So for existing users that do not use
virtualisation then there is no change.

> 
> >  	} else {
> > -		if (attr->exclude_kernel)
> > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> >  		if (!attr->exclude_hv)
> >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> >  	}
> > +
> > +	/*
> > +	 * Filter out !VHE kernels and guest kernels
> > +	 */
> > +	if (attr->exclude_kernel)
> > +		config_base |= ARMV8_PMU_EXCLUDE_EL1;
> 
> Why have you moved this hunk?

When guests use perf the guest PMU is emulated by KVM (virt/kvm/arm/pmu) and is
backed by host perf events, these events always have the exclude_host flag set.
Therefore to filter these kernel events we need to filter EL1. By moving this
hunk we ensure guest kernel filtering works even when the host is VHE. (The
side effect is that we unnecessarily filter EL1 on VHE host kernel events, but
this has no effect as no-one is using this except any guest.)

Thanks,

Andrew Murray

> 
> Will
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux