Re: KVM PMU virtualization

* Joerg Roedel <joro@xxxxxxxxxx> wrote:

> On Fri, Feb 26, 2010 at 10:55:17AM +0800, Zhang, Yanmin wrote:
> > On Thu, 2010-02-25 at 18:34 +0100, Joerg Roedel wrote:
> > > On Thu, Feb 25, 2010 at 04:04:28PM +0100, Jes Sorensen wrote:
> > > 
> > > > 1) Add support to perf to allow it to monitor a KVM guest from the
> > > >    host.
> > > 
> > > This shouldn't be a big problem. The PMU of AMD Fam10 processors can be
> > > configured to count only when in guest mode. Perf needs to be aware of
> > > that and fetch the rip from a different place when monitoring a guest.
> 
> > The idea is we want to measure both host and guest at the same time, and
> > compare all the hot functions fairly.
> 
> So you want to measure while the guest vcpu is running and the vmexit
> path of that vcpu (including qemu userspace part) together? The
> challenge here is to find out if a performance event originated in guest
> mode or in host mode.
> But we can check for that in the nmi-protected part of the vmexit path.

As far as instrumentation goes, virtualization is simply another 'PID 
dimension' of measurement.

Today we can isolate system performance measurements/events to the following 
domains:

 - per system
 - per cpu
 - per task

( Note that PowerPC already supports certain sorts of 'hypervisor/kernel/user' 
  domain separation, and we have some ABI details for all that but it's by no 
  means complete. Anton is using the PowerPC bits AFAIK, so it already works 
  to a certain degree. )

When extending measurements to KVM, we want two things:

 - user friendliness: instead of having to check 'ps' and figure out which 
   Qemu thread is the KVM thread we want to profile, just give a convenience
   namespace to access guest profiling info. -G ought to map to the first
   currently running KVM guest it can find (which would match like 90% of the
   cases), etc. No ifs and whens. If 'perf kvm top' doesn't show something 
   useful by default the whole effort is for naught.

 - Extend core facilities and enable the following measurement dimensions:

     host-kernel-space
     host-user-space
     guest-kernel-space
     guest-user-space

   on a per guest basis. We want to be able to measure just what the guest 
   does, and we want to be able to measure just what the host does.

   Some of this the hardware helps us with (say, measuring only host kernel 
   events is possible), some has to be done by fiddling with event 
   enable/disable at vm-exit / vm-entry time.
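
   To make the four dimensions concrete, here is a rough user-space model
   in Python, not kernel code. Every name in it (Domain, classify,
   should_count, Counter, vm_entry, vm_exit) is made up for illustration
   and does not correspond to any existing perf or KVM interface. It only
   shows how a sample could be attributed to one of the four domains, and
   how a counter that is limited to a subset of domains could be toggled in
   software at vm-entry / vm-exit when the hardware cannot filter by itself.

```python
from enum import Flag, auto


class Domain(Flag):
    """The four measurement dimensions listed above (hypothetical names)."""
    HOST_KERNEL = auto()
    HOST_USER = auto()
    GUEST_KERNEL = auto()
    GUEST_USER = auto()


GUEST_ONLY = Domain.GUEST_KERNEL | Domain.GUEST_USER
HOST_ONLY = Domain.HOST_KERNEL | Domain.HOST_USER


def classify(in_guest: bool, user_mode: bool) -> Domain:
    """Attribute a sample to exactly one domain."""
    if in_guest:
        return Domain.GUEST_USER if user_mode else Domain.GUEST_KERNEL
    return Domain.HOST_USER if user_mode else Domain.HOST_KERNEL


def should_count(wanted: Domain, in_guest: bool, user_mode: bool) -> bool:
    """True if a sample from this context falls inside the wanted domains."""
    return bool(wanted & classify(in_guest, user_mode))


class Counter:
    """Software model of a counter that is enabled/disabled around guest runs.

    Assumption: the counter cannot tell host from guest on its own, so the
    vm-entry / vm-exit hooks flip it on and off.
    """

    def __init__(self, wanted: Domain):
        self.wanted = wanted
        # Execution starts in host mode.
        self.enabled = bool(wanted & HOST_ONLY)
        self.count = 0

    def vm_entry(self) -> None:
        # About to run guest code: count only if guest domains are wanted.
        self.enabled = bool(self.wanted & GUEST_ONLY)

    def vm_exit(self) -> None:
        # Back in the host: count only if host domains are wanted.
        self.enabled = bool(self.wanted & HOST_ONLY)

    def event(self) -> None:
        if self.enabled:
            self.count += 1
```

   A guest-only counter in this model ignores events seen before vm_entry()
   and after vm_exit(), which is roughly what 'record only guest events by
   default' would mean. Note that the toggle only separates host from guest;
   splitting kernel from user within each side is assumed to come from
   elsewhere (e.g. the hardware).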

My suggestion, as always, would be to start very simple and very minimal:

Enable 'perf kvm top' to show guest overhead. Use the exact same kernel image 
as both host and guest (for testing), so we don't have to deal with the symbol 
space transport problem initially. Enable 'perf kvm record' to only record 
guest events by default. Etc.

This alone will be a quite useful result already - and gives a basis for 
further work. No need to spend months to do the big grand design straight 
away, all of this can be done gradually and in the order of usefulness - and 
you'll always have something that actually works (and helps your other KVM 
projects) along the way.

[ And, as so often, once you walk that path, that grand scheme you are 
  thinking about right now might easily become last year's really bad idea ;-) ]

So please start walking the path and experience the challenges first-hand.

Thanks,

	Ingo