* Joerg Roedel <joro@xxxxxxxxxx> wrote:

> On Fri, Feb 26, 2010 at 10:55:17AM +0800, Zhang, Yanmin wrote:
> > On Thu, 2010-02-25 at 18:34 +0100, Joerg Roedel wrote:
> > > On Thu, Feb 25, 2010 at 04:04:28PM +0100, Jes Sorensen wrote:
> > >
> > > > 1) Add support to perf to allow it to monitor a KVM guest from the
> > > >    host.
> > >
> > > This shouldn't be a big problem. The PMU of AMD Fam10 processors can be
> > > configured to count only when in guest mode. Perf needs to be aware of
> > > that and fetch the rip from a different place when monitoring a guest.
> >
> > The idea is we want to measure both host and guest at the same time, and
> > compare all the hot functions fairly.
>
> So you want to measure while the guest vcpu is running and the vmexit
> path of that vcpu (including qemu userspace part) together? The
> challenge here is to find out if a performance event originated in guest
> mode or in host mode.
> But we can check for that in the nmi-protected part of the vmexit path.

As far as instrumentation goes, virtualization is simply another 'PID
dimension' of measurement.

Today we can isolate system performance measurements/events to the following
domains:

 - per system
 - per cpu
 - per task

( Note that PowerPC already supports certain sorts of
  'hypervisor/kernel/user' domain separation, and we have some ABI details
  for all that but it's by no means complete. Anton is using the PowerPC bits
  AFAIK, so it already works to a certain degree. )

When extending measurements to KVM, we want two things:

 - user friendliness: instead of having to check 'ps' and figure out which
   Qemu thread is the KVM thread we want to profile, just give a convenience
   namespace to access guest profiling info. -G ought to map to the first
   currently running KVM guest it can find. (which would match like 90% of
   the cases) - etc. No ifs and when. If 'perf kvm top' doesn't show something
   useful by default the whole effort is for naught.

 - Extend core facilities and enable the following measurement dimensions:

     host-kernel-space
     host-user-space
     guest-kernel-space
     guest-user-space

   on a per guest basis. We want to be able to measure just what the guest
   does, and we want to be able to measure just what the host does.

   Some of this the hardware helps us with (say only measuring host kernel
   events is possible), some has to be done by fiddling with event
   enable/disable at vm-exit / vm-entry time.

My suggestion, as always, would be to start very simple and very minimal:

Enable 'perf kvm top' to show guest overhead. Use the exact same kernel image
both as a host and as guest (for testing), to not have to deal with the symbol
space transport problem initially. Enable 'perf kvm record' to only record
guest events by default. Etc.

This alone will be a quite useful result already - and gives a basis for
further work. No need to spend months to do the big grand design straight
away, all of this can be done gradually and in the order of usefulness - and
you'll always have something that actually works (and helps your other KVM
projects) along the way.

[ And, as so often, once you walk that path, that grand scheme you are
  thinking about right now might easily become last year's really bad idea
  ;-) ]

So please start walking the path and experience the challenges first-hand.

Thanks,

	Ingo