On 4/23/2024 9:01 AM, maobibo wrote:
On 2024/4/23 1:01 AM, Sean Christopherson wrote:
On Mon, Apr 22, 2024, maobibo wrote:
On 2024/4/16 6:45 AM, Sean Christopherson wrote:
On Mon, Apr 15, 2024, Mingwei Zhang wrote:
On Mon, Apr 15, 2024 at 10:38 AM Sean Christopherson
<seanjc@xxxxxxxxxx> wrote:
One of my biggest complaints with the current vPMU code is that the
roles and responsibilities between KVM and perf are poorly defined,
which leads to suboptimal and hard to maintain code.
Case in point, I'm pretty sure leaving guest values in PMCs _would_ leak
guest state to userspace processes that have RDPMC permissions, as the
PMCs might not be dirty from perf's perspective (see
perf_clear_dirty_counters()).
Blindly clearing PMCs in KVM "solves" that problem, but in doing so
makes the overall code brittle because it's not clear whether KVM
_needs_ to clear PMCs, or if KVM is just being paranoid.
So once this rolls out, perf and vPMU are clients directly to PMU HW.
I don't think this is a statement we want to make, as it opens a
discussion that we won't win. Nor do I think it's one we *need* to make.
KVM doesn't need to be on equal footing with perf in terms of
owning/managing PMU hardware, KVM just needs a few APIs to allow
faithfully and accurately virtualizing a guest PMU.
Faithful cleaning (blind cleaning) has to be the baseline
implementation, until both clients agree to a "deal" between them.
Currently, there is no such deal, but I believe we could have one via
future discussion.
What I am saying is that there needs to be a "deal" in place before this
code is merged. It doesn't need to be anything fancy, e.g. perf can
still pave over PMCs it doesn't immediately load, as opposed to using
cpu_hw_events.dirty to lazily do the clearing. But perf and KVM need to
work together from the get go, i.e. I don't want KVM doing something
without regard to what perf does, and vice versa.
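To make the leak scenario and the kind of "deal" discussed above concrete, here is a toy user-space model in C. It is only a sketch under stated assumptions: struct pmu_model, guest_write() and kvm_report_guest_counters() are hypothetical names invented for illustration, and clear_dirty_counters() only loosely mirrors the idea behind perf_clear_dirty_counters(); none of this is kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_COUNTERS 4

/* Toy model of one CPU's PMU; not the kernel's struct cpu_hw_events. */
struct pmu_model {
    uint64_t pmc[NR_COUNTERS];  /* counter values */
    uint32_t dirty;             /* bit i set: perf knows pmc[i] was written */
};

/* perf programs a counter and records it as dirty. */
static void perf_write(struct pmu_model *p, int idx, uint64_t val)
{
    assert(idx >= 0 && idx < NR_COUNTERS);
    p->pmc[idx] = val;
    p->dirty |= 1u << idx;
}

/* A guest with direct PMU access writes a counter; perf is not told. */
static void guest_write(struct pmu_model *p, int idx, uint64_t val)
{
    assert(idx >= 0 && idx < NR_COUNTERS);
    p->pmc[idx] = val;
}

/* Hypothetical "deal": KVM reports which counters the guest may have touched. */
static void kvm_report_guest_counters(struct pmu_model *p, uint32_t mask)
{
    p->dirty |= mask;
}

/* Lazy clearing: zero only counters marked dirty; return how many were cleared. */
static int clear_dirty_counters(struct pmu_model *p)
{
    int cleared = 0;

    for (int i = 0; i < NR_COUNTERS; i++) {
        if (p->dirty & (1u << i)) {
            p->pmc[i] = 0;
            cleared++;
        }
    }
    p->dirty = 0;
    return cleared;
}
```

In this model, a counter written only through guest_write() survives clear_dirty_counters(), which is the leak described above. Once the guest's counters are reported via kvm_report_guest_counters(), the lazy path clears them and KVM no longer has to clear everything blindly.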
There is a similar issue on LoongArch vPMU, where the VM can directly
access PMU hardware and the PMU hw is shared between guest and host.
Besides context switch, there are other places where the perf core will
access the PMU hw, such as the tick timer, hrtimer, and IPI function
calls, and KVM can only intercept context switch.
Two questions:
1) Can KVM prevent the guest from accessing the PMU?
2) If so, can KVM grant partial access to the PMU, or is it all or
nothing?
If the answer to both questions is "yes", then it sounds like LoongArch
*requires* mediated/passthrough support in order to virtualize its PMU.
Hi Sean,
Thanks for your quick response.
Yes, KVM can prevent the guest from accessing the PMU, and it can grant
partial or full access to the PMU. However, once a PMU event is granted
to the VM, the host cannot access that PMU event again; there must be a
PMU event switch if the host wants to.
A PMU event is a software entity which won't be shared. Did you mean
that if a PMU HW counter is granted to the VM, then the host can't
access that PMU HW counter?
Can we add a callback handler to the kvm_guest_cbs structure? Something
like this:
@@ -6403,6 +6403,7 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
.state = kvm_guest_state,
.get_ip = kvm_guest_get_ip,
.handle_intel_pt_intr = NULL,
+ .lose_pmu = kvm_guest_lose_pmu,
};
By the way, I do not know whether the callback handler should be
triggered in the perf core or in the specific PMU hw driver. On ARM it
is triggered from the PMU hw driver, e.g. via the function
kvm_vcpu_pmu_resync_el0, but I think it would be better if it were done
in the perf core.
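For illustration only, the proposed hook could be modelled in user space roughly as follows. This is a sketch under assumptions, not the actual patch: struct guest_cbs_model is a stand-in for perf_guest_info_callbacks, and the guest_owns_pmu query, model_guest_lose_pmu() and host_perf_access_pmu() are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the callbacks table; not the kernel struct. */
struct guest_cbs_model {
    bool (*guest_owns_pmu)(void);  /* hypothetical query */
    void (*lose_pmu)(void);        /* hypothetical hook from the proposal */
};

static bool pmu_granted_to_guest = true;
static int pmu_context_switches;

static bool model_guest_owns_pmu(void)
{
    return pmu_granted_to_guest;
}

/* Model of the proposed handler: save guest PMU state, return hw to host. */
static void model_guest_lose_pmu(void)
{
    pmu_granted_to_guest = false;
    pmu_context_switches++;
}

static const struct guest_cbs_model model_cbs = {
    .guest_owns_pmu = model_guest_owns_pmu,
    .lose_pmu       = model_guest_lose_pmu,
};

/* Models a host perf path (tick, hrtimer, IPI) about to touch PMU hw. */
static void host_perf_access_pmu(const struct guest_cbs_model *cbs)
{
    assert(cbs != NULL);

    /* Reclaim the PMU from the guest only if it currently owns it. */
    if (cbs->lose_pmu && cbs->guest_owns_pmu && cbs->guest_owns_pmu())
        cbs->lose_pmu();

    /* From here on the host may program counters. */
}
```

Calling host_perf_access_pmu(&model_cbs) twice triggers exactly one PMU switch in this model, since the second call finds the PMU already back with the host. Whether such a call would belong in the perf core or in each PMU hw driver is the open question raised above.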
I don't think we want to take the approach of perf and KVM guests
"fighting" over the PMU. That's effectively what we have today, and it's
a mess for KVM because it's impossible to provide consistent,
deterministic behavior for the guest. And it's just as messy for perf,
which ends up having weird, cumbersome flows that exist purely to try to
play nice with KVM.
With the existing PMU core code, the host may access the PMU hw from the
tick timer interrupt or an IPI function call interrupt while the VM is
running and the PMU is already granted to the guest. KVM cannot
intercept host IPI/timer interrupts, so there is no PMU context switch,
and that will be a problem.
Regards
Bibo Mao