Please; don't send malformed emails like this. Lines wrap at 78 chars.

On Mon, Nov 05, 2018 at 03:37:24PM +0000, Wang, Wei W wrote:
> On Monday, November 5, 2018 8:14 PM, Peter Zijlstra wrote:

> > That can only work if the host counter has perf_event_attr::exclude_guest=1,
> > any counter without that must also count when the guest is running.
> >
> > (and, IIRC, normal perf tool events do not have that set by default)
>
> Probably no. Please see Line 81 at
> https://github.com/torvalds/linux/blob/master/tools/perf/util/util.c
> perf_guest by default is false, which makes "attr->exclude_guest = 1"

Then you're in luck. But if the host creates an event that has
exclude_guest=0 set, it should still work.

> > The thing is; you cannot do blind pass-through of the PMU, some of its
> > features simply do not work in a guest. Also, the host perf driver expects
> > certain functionality that must be respected.
>
> Actually we are not blindly assigning the perf counters. Guest works
> with its own complete perf stack (like the one on the host) which also
> has its own constraints.

But it knows nothing of the host state.

> The counter is also not passed through to the guest, guest accesses to
> the assigned counter will still exit to the hypervisor, and the
> hypervisor helps update the counter.

Yes, you have to; because the PMU doesn't properly virtualize, also
because the HV -- linux in our case -- already claimed the PMU.

So the network passthrough case you mentioned simply doesn't apply at
all. Don't bother looking at it for inspiration.

> > Those are the constraints you have to work with.
> >
> > Back when we all started down this virt rathole, I proposed people do
> > paravirt perf, where events would be handed to the host kernel and let the
> > host kernel do its normal thing. But people wanted to do the MSR based
> > thing because of !linux guests.
>
> IMHO, it is worthwhile to care more about the real use case. When a
> user gets a virtual machine from a vendor, all he can do is to run
> perf inside the guest. The above contention concerns would not happen,
> because the user wouldn't be able to come to the host to run perf on
> the virtualization software (e.g. ./perf qemu..) and in the meantime
> running perf in the guest to cause the contention.

That's your job. Mine is to make sure that whatever you propose fits in
the existing model and doesn't make a giant mess of things.

And for Linux guests on Linux hosts, paravirt perf still makes the most
sense to me; then you get the host scheduling all the events and
providing the guest with the proper counts/runtimes/state.

> On the other hand, when we improve the user experience of running perf
> inside the guest by reducing the virtualization overhead, that would
> bring real benefits to the real use case.

You can start to improve things by doing a less stupid implementation of
the existing code.
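
To make the exclude_guest point from the top of this mail concrete, here
is a toy model of the rule, not kernel code. HostEvent,
must_count_in_guest() and blocks_guest_takeover() are names made up for
this sketch; only the meaning of the exclude_guest bit is taken from
perf_event_attr.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HostEvent:
    """Hypothetical stand-in for a host perf event; not a kernel structure."""
    name: str
    exclude_guest: bool  # same meaning as perf_event_attr::exclude_guest


def must_count_in_guest(event: HostEvent) -> bool:
    """A host event without exclude_guest=1 must keep counting in guest mode."""
    return not event.exclude_guest


def blocks_guest_takeover(events: List[HostEvent]) -> List[str]:
    """Names of host events whose counters cannot simply be given to the guest."""
    return [e.name for e in events if must_count_in_guest(e)]


host_events = [
    HostEvent("cycles", exclude_guest=True),         # what the perf tool sets
    HostEvent("instructions", exclude_guest=False),  # must still work
]
print(blocks_guest_takeover(host_events))  # ['instructions']
```

Any scheme that hands counters to the guest has to keep the second kind
of event counting correctly while the guest runs.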