Re: [PATCH v10 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes

Christoffer Dall <christoffer.dall@xxxxxxx> · Wed, 6 Mar 2019 09:42:59 +0100

On Tue, Mar 05, 2019 at 11:45:52AM +0000, Andrew Murray wrote:
> On Mon, Mar 04, 2019 at 11:14:24AM +0000, Andrew Murray wrote:
> > On Tue, Feb 26, 2019 at 01:44:59PM +0100, Christoffer Dall wrote:
> > > On Wed, Feb 20, 2019 at 04:15:40PM +0000, Andrew Murray wrote:
> > > > On Mon, Feb 18, 2019 at 10:53:07PM +0100, Christoffer Dall wrote:
> > > > > On Mon, Jan 14, 2019 at 04:11:47PM +0000, Andrew Murray wrote:
> > > > > > Add support for the :G and :H attributes in perf by handling the
> > > > > > exclude_host/exclude_guest event attributes.
> > > > > > 
> > > > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > > > as per the events exclude_host attribute.
> > > > > > 
> > > > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > > > at EL2. We are able to eliminate counters counting host events on
> > > > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > > > on non-VHE then there is a small blackout window at the guest
> > > > > > entry/exit where host events are not captured.
> > > > > > 
> > > > > > Signed-off-by: Andrew Murray <andrew.murray@xxxxxxx>
> > > > > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> > > > > > ---
> 
> > > > > 
> > > > > Let me see if I get this right:
> > > > > 
> > > > > 	exclude_user:	VHE: Don't count EL0
> > > > > 			Non-VHE: Don't count EL0
> > > > > 
> > > > > 	exclude_kernel: VHE: Don't count EL2 and don't count EL1
> > > > > 			Non-VHE: Don't count EL1
> > > > > 
> > > > > 	exclude_hv:	VHE: No effect
> > > > > 			Non-VHE: Don't count EL2
> > > > > 
> > > > > 	exclude_host:	VHE: Don't count EL2 + enable/disable on guest entry/exit
> > > > > 			Non-VHE: disable on guest entry/disable on guest entry/exit
> > > > > 
> > > > > And the logic I extract is that _user applies across both guest and
> > > > > host, as does _kernel (regardless of the mode the kernel on the current
> > > > > system runs in, might be only EL1, might be EL1 and EL2), and _hv is
> > > > > specific to non-VHE systems to measure events in a specific piece of KVM
> > > > > code that runs at EL2.
> > > > > 
> > > > > As I expressed before, that doesn't seem to be the intent behind the
> > > > > exclude_hv flag, but I'm not sure how other architectures actually
> > > > > implement things today, and even if it's a curiosity of the Arm
> > > > > architecture and has value to non-VHE hypervisor hackers, and we don't
> > > > > really have to care about uniformity with the other architectures, then
> > > > > fine.
> > > > > 
> > > > > It has taken me a while to make sense of this code change, so I really
> > > > > wish we can find a suitable place to document the semantics clearly for
> > > > > perf users on arm64.
> > > > > 
> > > > > Now, another thing comes to mind:  Do we really need to enable and
> > > > > disable anything on a VHE system on entry/exit to/from a guest?  Can we
> > > > > instead do the following:
> > > > > 
> > > > > 	exclude_host:	Disable EL2 counting
> > > > > 		     	Disable EL0 counting
> > > > > 		     	Enable EL0 counting on vcpu_load
> > > > > 		     	  (unless exclude_user is also set)
> > > > > 		     	Disable EL0 counting on vcpu_put
> > > > > 
> > > > > 	exclude_guest:	Disable EL1 counting
> > > > > 		      	Disable EL0 counting on vcpu_load
> > > > > 		      	Enable EL0 counting on vcpu_put
> > > > > 			  (unless exclude_user is also set)
> > > > > 
> > > > > If that works, we can avoid the overhead in the critical path on VHE
> > > > > systems and actually have slightly more accurate counting, leaving the
> > > > > entry/exit operations to be specific to non-VHE.
> > > > 
> > > > This all makes sense.
> > > > 
> > > > At present on VHE, for host only events, there is a small blackout window
> > > > at the guest entry/exit - this is where we turn off/on host counting before
> > > > entering/exiting the guest. (This blackout window also exists on !VHE unless
> > > > exclude_hv is set).
> > > > 
> > > > To mitigate against this the PMU switching code was brought as close to the
> > > > guest entry/exit as possible - but as you point out at this point in
> > > > kvm_arch_vcpu_ioctl_run we're running without interrupts/preemption and so
> > > > on the critical path. I believe we add about 11 instructions when there are
> > > > no PMU counters enabled and about 23 when they are enabled. I suppose it
> > > > would be possible to use static keys to reduce the overhead when counters
> > > > are not enabled...
> > > > 
> > > > Your suggestion provides an optimal solution, however it adds some
> > > > complexity - here is a rough attempt at implementing it...
> > > > 
> 
> > > > 
> > > > Do you think this is worth developing further?
> > > > 
> > > 
> > > Yes, I think it's worth going to fairly great lengths to keep the
> > > critical path on VHE systems lean, and I suspect if we take the easy way
> > > to add functionality first, it will only be harder to factor things out
> > > later.
> 
> It looks like we can also apply this same approach on !VHE. For the sole
> use-case where a user wishes to count only for a host or guest EL0, then we
> switch the counter at EL1 in vcpu_load instead of the switch code.
> 

ok, even better then.  Looking forward to seeing the patches.

Thanks,

    Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm