On 2019/3/24 7:15, Andi Kleen wrote:
We optimize the current vPMU to work in this manner:
(1) rely on the existing host perf (perf_event_create_kernel_counter)
to allocate counters for in-use vPMC and always try to reuse events;
(2) vPMU captures guest accesses to the eventsel and fixctrl MSRs and
applies them directly to the hardware MSR that the corresponding host
event is scheduled on; pollution from the host must also be avoided
during its partial runtime;
If you do pass-through, how do you deal with event constraints?
The guest has to deal with them. It already needs to know
the model number to program the right events, can as well know
the constraints too.
For architectural events that don't need the model number it's
not a problem because they don't have constraints.
-Andi
I agree that this version deliberately does not track host perf event
constraints:
1. Based on my limited knowledge, I assume "the model number" means
hwc->idx.
2. The guest event constraints would be encoded into the
hwc->config_base value, which is pmc->eventsel and pmu->fixed_ctr_ctrl
from KVM's point of view.
3. The guest PMU has the same semantic model of the virtual hardware
limitations as the host has of the real PMU (the related CPUID/PERF_MSR
expose this information to the guest).
4. The guest perf scheduler would make sure the guest event constraints
work with the right guest model number.
5. vPMU would make sure the guest vPMC gets the right guest model number
by hard-coding EVENT_PINNED, or else fail the creation.
6. This patch directly applies the guest hwc->config_base value to the
host-assigned hardware without consent from host perf (a bit deceptive,
but practical for reducing the number of reprogram calls).
=== OR ====
If we insist on passing guest event constraints to host perf,
this proposal may need the following changes:
Because guest writes to hwc->config_base mostly only toggle the enable
bit of eventsel or fixctrl, it is not necessary to call
reprogram_counter in that case, since the counter is still serving the
same guest perf event.
Event creation is only needed when the guest writes a completely new
value to eventsel or fixctrl. The code for the guest MSR_P6_EVNTSEL0
trap, for example, could be modified like this:
	u64 diff = pmc->eventsel ^ data;
	if (intel_pmc_is_assigned(pmc)
	    && diff != ARCH_PERFMON_EVENTSEL_ENABLE) {
		intel_pmu_save_guest_pmc(pmu, pmc->idx);
		intel_pmc_stop_counter(pmc);
	}
	reprogram_gp_counter(pmc, data);
Does this seem to satisfy our needs?
It would make everything easier. Please correct me if I'm wrong.
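
To make the enable-bit check in the snippet above concrete, here is a
minimal standalone sketch of the classification step only. It is not
the patch itself: EVENTSEL_ENABLE is a local copy of bit 22 (the value
of ARCH_PERFMON_EVENTSEL_ENABLE), and eventsel_needs_new_event() is a
hypothetical helper name used for illustration. Note one deliberate
difference: it masks the enable bit out of the diff instead of
comparing for equality, so a write of an identical value is also
treated as "same event", which the snippet above does not do.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Local stand-in for ARCH_PERFMON_EVENTSEL_ENABLE: bit 22 (EN) of the
 * event select MSR. Redefined here so the sketch builds outside the kernel.
 */
#define EVENTSEL_ENABLE (1ULL << 22)

/*
 * Hypothetical helper (not a KVM function): returns true when a guest
 * write changes any eventsel bit other than the enable bit, i.e. when
 * the old host event can no longer serve the guest counter.
 */
bool eventsel_needs_new_event(uint64_t old_val, uint64_t new_val)
{
	uint64_t diff = old_val ^ new_val;

	/* Ignore a toggle of the enable bit; anything else is a new event. */
	return (diff & ~EVENTSEL_ENABLE) != 0;
}
```

For example, going from 0x4300c0 to 0x0300c0 only clears the enable
bit, so the helper returns false and the existing event could be
reused, while going from 0x4300c0 to 0x43003c changes the event code
and returns true.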