On Wed, Dec 12, 2018 at 09:07:42AM +0100, Christoffer Dall wrote:
> On Tue, Dec 11, 2018 at 01:59:03PM +0000, Andrew Murray wrote:
> > On Tue, Dec 11, 2018 at 10:06:53PM +1100, Michael Ellerman wrote:
> > > [ Reviving old thread. ]
> > >
> > > Andrew Murray <andrew.murray@xxxxxxx> writes:
> > > > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote:
> > > >> Andrew Murray <andrew.murray@xxxxxxx> writes:
> > > >>
> > > >> > Update design.txt to reflect the presence of the exclude_host
> > > >> > and exclude_guest perf flags.
> > > >> >
> > > >> > Signed-off-by: Andrew Murray <andrew.murray@xxxxxxx>
> > > >> > ---
> > > >> >  tools/perf/design.txt | 4 ++++
> > > >> >  1 file changed, 4 insertions(+)
> > > >> >
> > > >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt
> > > >> > index a28dca2..7de7d83 100644
> > > >> > --- a/tools/perf/design.txt
> > > >> > +++ b/tools/perf/design.txt
> > > >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' bits provide a
> > > >> >  way to request that counting of events be restricted to times when the
> > > >> >  CPU is in user, kernel and/or hypervisor mode.
> > > >> >
> > > >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
> > > >> > +to request counting of events restricted to guest and host contexts when
> > > >> > +using virtualisation.
> > > >>
> > > >> How does exclude_host differ from exclude_hv ?
> > > >
> > > > I believe exclude_host / exclude_guest are intended to distinguish
> > > > between host and guest in the hosted hypervisor context (KVM).
> > >
> > > OK yeah, from the perf-list man page:
> > >
> > >     u - user-space counting
> > >     k - kernel counting
> > >     h - hypervisor counting
> > >     I - non idle counting
> > >     G - guest counting (in KVM guests)
> > >     H - host counting (not in KVM guests)
> > >
> > > > Whereas exclude_hv allows to distinguish between guest and
> > > > hypervisor in the bare-metal type hypervisors.
> > >
> > > Except that's exactly not how we use them on powerpc :)
> > >
> > > We use exclude_hv to exclude "the hypervisor", regardless of whether
> > > it's KVM or PowerVM (which is a bare-metal hypervisor).
> > >
> > > We don't use exclude_host / exclude_guest at all, which I guess is a
> > > bug, except I didn't know they existed until this thread.
> > >
> > > eg, in a KVM guest:
> > >
> > >   $ perf record -e cycles:G /bin/bash -c "for i in {0..100000}; do :;done"
> > >   $ perf report -D | grep -Fc "dso: [hypervisor]"
> > >   16
> > >
> > >
> > > > In the case of arm64 - if VHE extensions are present then the host
> > > > kernel will run at a higher privilege to the guest kernel, in which
> > > > case there is no distinction between hypervisor and host so we ignore
> > > > exclude_hv. But where VHE extensions are not present then the host
> > > > kernel runs at the same privilege level as the guest and we use a
> > > > higher privilege level to switch between them - in this case we can
> > > > use exclude_hv to discount that hypervisor role of switching between
> > > > guests.
> > >
> > > I couldn't find any arm64 perf code using exclude_host/guest at all?
> >
> > Correct - but this is in flight as I am currently adding support for this
> > see [1].
> >
> > >
> > > And I don't see any x86 code using exclude_hv.
> >
> > I can't find any either.
> >
> > >
> > > But maybe that's OK, I just worry this is confusing for users.
> >
> > There is some extra context regarding this where exclude_guest/exclude_host
> > was added, see [2] and where exclude_hv was added, see [3]
> >
> > Generally it seems that exclude_guest/exclude_host relies upon switching
> > counters off/on on guest/host switch code (which works well in the nested
> > virt case). Whereas exclude_hv tends to rely solely on hardware capability
> > based on privilege level (which works well in the bare metal case where
> > the guest doesn't run at same privilege as the host).
> >
> > I think from the user perspective exclude_hv allows you to see your overhead
> > if you are a guest (i.e. work done by bare metal hypervisor associated with
> > you as the guest). Whereas exclude_guest/exclude_host doesn't allow you to
> > see events above you (i.e. the kernel hypervisor) if you are the guest...
> >
> > At least that's how I read this, I've copied in others that may provide
> > more authoritative feedback.
> >
> > [1] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-December/033698.html
> > [2] https://www.spinics.net/lists/kvm/msg53996.html
> > [3] https://lore.kernel.org/patchwork/patch/143918/
> >
>
> I'll try to answer this in a different way, based on previous
> discussions with Joerg et al. who introduced these flags.  Assume no
> support for nested virtualization as a first approximation:
>
>   If you are running as a guest:
>     - exclude_hv: stop counting events when the hypervisor runs
>     - exclude_host: has no effect
>     - exclude_guest: has no effect
>
>   If you are running as a host/hypervisor:
>     - exclude_hv: has no effect
>     - exclude_host: only count events when the guest is running
>     - exclude_guest: only count events when the host is running
>
> With nested virtualization, you get the natural union of the above.
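
To check my reading of the two lists above, here is a minimal sketch of
them as a truth function. It only models the non-nested case described
there, it is not kernel code, and the names Excludes, counts, role and
running are made up for illustration (they are not perf or KVM API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Excludes:
    """Hypothetical stand-in for the three perf_event_attr bits discussed."""
    hv: bool = False
    host: bool = False
    guest: bool = False


def counts(role: str, running: str, ex: Excludes) -> bool:
    """Return True if an event is counted under the rules listed above.

    role:    where perf itself runs, "guest" or "host" (host/hypervisor).
    running: what the CPU is executing at that moment. Seen from a guest
             this is "guest" or "hypervisor"; seen from a host it is
             "host" or "guest". These labels are assumptions of the sketch.
    """
    if role == "guest":
        # Only exclude_hv matters; exclude_host/exclude_guest have no effect.
        return not (ex.hv and running == "hypervisor")
    if role == "host":
        # exclude_hv has no effect; the other two select guest or host time.
        if ex.host and running == "host":
            return False
        if ex.guest and running == "guest":
            return False
        return True
    raise ValueError(f"unknown role: {role!r}")
```

For example, counts("host", "guest", Excludes(host=True)) is True, while
counts("guest", "hypervisor", Excludes(host=True)) is also True because
exclude_host is ignored when running as a guest.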
>
> **This has nothing to do with the design of the hypervisor such as the
> ARM non-VHE KVM which splits its execution across EL1 and EL2 -- those
> are both considered host from the point of view of Linux as a hypervisor
> using KVM, and both considered hypervisor from the point of view of a
> guest.**

For clarity, this is what arm64 currently does (assuming no nesting and
without the current version of this patchset):

  If you are running as a guest (VHE or !VHE host):
    - exclude_hv: has no effect for a KVM guest (filters hypervisor on
      !VHE bare metal hypervisor guest)
    - exclude_host: has no effect
    - exclude_guest: has no effect

  If you are running as a host/hypervisor:
    - exclude_hv: has no effect for VHE (filters EL2 on !VHE)
    - exclude_host: only count events when the guest is running
    - exclude_guest: only count events when the host is running

Is this as expected?

Thanks,

Andrew Murray

>
>
> Thanks,
>
> Christoffer