RE: [PATCH v2] Documentation: KVM: Add vPMU implementation and gap document

> > +
> > +3. Arch PMU virtualization
> > +==========================
> > +
> > +3.1. Overview
> > +-------------
> > +
> > +Once KVM/QEMU exposes the vCPU's Arch PMU capability to the guest, the
> > +guest PMU driver accesses the Arch PMU MSRs (including the Fixed and GP
> > +counters) as the host does. All guest accesses to Arch PMU MSRs can be
> > +intercepted.
> > +
> > +When a guest virtual counter is enabled through a guest MSR write, KVM
> > +traps the write and creates a kvm perf event through the perf subsystem.
> > +The kvm perf event's attributes are derived from the guest virtual
> > +counter's MSR settings.
> > +
> > +When the guest later changes the virtual counter's settings, KVM traps
> > +the access, releases the old kvm perf event, and creates a new kvm perf
> > +event with the new settings.
> > +
> > +When the guest reads the virtual counter, KVM traps the access, reads
> > +the kvm perf event's counter value, and adds it to the previously
> > +accumulated counter value.
> > +
> > +When the guest has not accessed the virtual counter's MSRs within a
> > +scheduling time slice and the virtual counter is disabled, KVM
> > +releases the kvm perf event.
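
The lifecycle in the paragraphs above can be sketched as a small toy model.
This is only an illustration under stated assumptions: VirtualCounter and
its methods are hypothetical names, no kernel API is used, and carrying the
old count over when the settings change is an assumption of the sketch, not
something the patch text states.

```python
# Toy model of the virtual-counter lifecycle. All names are hypothetical;
# this is not KVM code and does not call any kernel interface.

class VirtualCounter:
    def __init__(self):
        self.saved = 0         # count accumulated from released events
        self.event = None      # stands in for the backing kvm perf event
        self.enabled = False
        self.accessed = False  # guest touched the MSRs in this time slice

    def _release(self):
        # Fold the event's count into the saved value, then drop the event.
        if self.event is not None:
            self.saved += self.event["count"]
            self.event = None

    def write_msr(self, config, enable):
        # Trapped guest MSR write.
        self.accessed = True
        if enable:
            # New settings: release the old event, create a new one.
            self._release()
            self.event = {"config": config, "count": 0}
        self.enabled = enable

    def hardware_counts(self, n):
        # Pretend the hardware counted n occurrences of the event.
        if self.event is not None and self.enabled:
            self.event["count"] += n

    def read(self):
        # Trapped guest counter read: event count plus previous value.
        self.accessed = True
        live = self.event["count"] if self.event is not None else 0
        return self.saved + live

    def end_of_slice(self):
        # Release only if the guest left the MSRs alone for the whole
        # slice and the counter is disabled.
        if not self.accessed and not self.enabled:
            self._release()
        self.accessed = False


vc = VirtualCounter()
vc.write_msr("instructions", enable=True)
vc.hardware_counts(100)
first_read = vc.read()

vc.write_msr("cycles", enable=True)      # settings changed: event replaced
vc.hardware_counts(50)
second_read = vc.read()

vc.write_msr("cycles", enable=False)     # disabled, event kept for now
vc.end_of_slice()                        # MSR was accessed: keep the event
kept_after_first_slice = vc.event is not None
vc.end_of_slice()                        # idle and disabled: release
final_read = vc.read()
```

In this sketch the event survives the slice in which the guest disabled the
counter and is only released after a full slice without any MSR access.
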
> > +
> > +  ----------------------------
> > +  |  Guest                   |
> > +  |  perf subsystem          |
> > +  ----------------------------
> > +       |            ^
> > +  vMSR |            | vPMI
> > +       v            |
> > +  ----------------------------
> > +  |  vPMU        KVM vCPU    |
> > +  ----------------------------
> > +        |          ^
> > +  Call  |          | Callbacks
> > +        v          |
> > +  ---------------------------
> > +  | Host Linux Kernel       |
> > +  | perf subsystem          |
> > +  ---------------------------
> > +               |       ^
> > +           MSR |       | PMI
> > +               v       |
> > +         --------------------
> > +         | PMU        CPU   |
> > +         --------------------
> > +
> > +Each guest virtual counter has a corresponding kvm perf event. The kvm
> > +perf event takes part in host perf scheduling and follows the host perf
> > +scheduler's rules. While the kvm perf event is scheduled by the host perf
> > +scheduler and is active, the guest virtual counter supplies the correct
> > +value. However, if another host perf event takes over the kvm perf
> > +event's resource, the kvm perf event becomes inactive and the virtual
> > +counter keeps the value saved when the kvm perf event was preempted.
> > +The guest perf subsystem does not notice that the underlying virtual
> > +counter has stopped, so the final guest profiling data is wrong.
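
A minimal sketch of why the guest's data ends up wrong, assuming a made-up
timeline of time slices: the function name and the numbers are purely
illustrative and are not measurements.

```python
def guest_observed(slices):
    """slices: list of (events_in_slice, kvm_event_active) tuples.

    Returns (true_total, observed_total). The virtual counter only
    advances in slices where the kvm perf event is active; in the other
    slices it keeps its saved value.
    """
    true_total = 0
    observed = 0
    for events, active in slices:
        true_total += events
        if active:
            observed += events
    return true_total, observed


# Hypothetical timeline: the kvm perf event is preempted in the middle slice.
timeline = [(1000, True), (1000, False), (1000, True)]
true_total, observed = guest_observed(timeline)
lost = true_total - observed
```

The guest has no indication that the middle slice was not counted, so it
treats the lower total as the real one.
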
> > +
> > +3.2. Host and Guest perf event contention
> > +-----------------------------------------
> > +
> > +A kvm perf event is a per-process pinned event, which has the second
> > +highest priority. When a kvm perf event is active, it can be preempted by
> > +a host per-cpu pinned perf event, and it can preempt host flexible perf
> > +events. Such preemption can be temporarily prevented by disabling host
> > +IRQs.
> > +
> > +The following results are expected when host and guest perf events
> > +coexist under the perf scheduler's rules:
> > +1). If host per-cpu pinned events occupy all the HW resources, the kvm
> > +perf event cannot become active because no resource is available, and
> > +the virtual counter value is always zero when the guest reads it.
> > +2). If a host per-cpu pinned event releases a HW resource while the kvm
> > +perf event is inactive, the kvm perf event can claim the HW resource and
> > +become active. The guest then gets the correct value from the guest
> > +virtual counter while the kvm perf event is active, but the guest's
> > +total count is not correct because the counts from the period when the
> > +kvm perf event was inactive are lost.
> > +3). If the kvm perf event is active and a host per-cpu pinned perf event
> > +then becomes active and reclaims the kvm perf event's resource, the kvm
> > +perf event becomes inactive. The virtual counter value then stays
> > +unchanged and the guest reads the previously saved value, so the
> > +guest's total count is not correct.
> > +4). If host flexible perf events occupy all the HW resources, the kvm
> > +perf event can become active by preempting a host flexible perf event's
> > +resource, and the guest gets the correct value from the guest virtual
> > +counter.
> > +5). If the kvm perf event is active and other host flexible perf events
> > +then request to become active, the kvm perf event still owns the
> > +resource and stays active, so the guest gets the correct value from the
> > +guest virtual counter.
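
The five cases above follow from a simple priority order, which the toy
scheduler below tries to capture. It is a deliberate simplification: the
real perf scheduler also handles counter constraints and rotation of
flexible events, and the names here (PRIORITY, schedule, "kvm", "h0",
"f0", ...) are hypothetical.

```python
# Lower number = higher priority. The kvm perf event is "task_pinned".
PRIORITY = {"cpu_pinned": 0, "task_pinned": 1, "flexible": 2}


def schedule(events, num_counters):
    """events: list of (name, kind). Returns the set of active event names.

    Counters go to events in priority order; sorted() is stable, so
    events of the same kind keep their arrival order.
    """
    ordered = sorted(events, key=lambda e: PRIORITY[e[1]])
    return {name for name, _ in ordered[:num_counters]}


kvm = ("kvm", "task_pinned")

# Case 1: host per-cpu pinned events occupy both counters.
case1 = schedule([("h0", "cpu_pinned"), ("h1", "cpu_pinned"), kvm], 2)

# Case 2: one host pinned event went away, so kvm can claim the counter.
case2 = schedule([("h0", "cpu_pinned"), kvm], 2)

# Case 3: kvm was active on the only counter, then a host pinned event arrives.
case3 = schedule([kvm, ("h0", "cpu_pinned")], 1)

# Case 4: host flexible events held both counters before kvm showed up.
case4 = schedule([("f0", "flexible"), ("f1", "flexible"), kvm], 2)

# Case 5: kvm is active and more flexible events ask for counters.
case5 = schedule([kvm, ("f0", "flexible"), ("f1", "flexible"),
                  ("f2", "flexible")], 2)
```

In cases 1 and 3 "kvm" is missing from the returned set, which corresponds
to the inactive state described above; in cases 2, 4 and 5 it is present.
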
> > +
> > +3.3. vPMU Arch Gaps
> > +-------------------
> > +
> > +The coexistence of host and guest perf events has gaps:
> > +1). When the guest accesses PMU MSRs for the first time, KVM traps the
> > +access and creates a kvm perf event, but this event may be inactive because the
> 
> inactive? It seems the event should enter the error state based on the
> previous description?
[Zhang, Xiong Y] Yes, the code will put it into the error state. I used "inactive" instead of "error" throughout the vPMU section to make things easier to understand for readers who are not familiar with the perf scheduler, and to imply that this state is recoverable. I should change it to "error state", since the perf scheduler section has already introduced it.

thanks




