On Thu, 2024-12-19 at 18:33 +0100, Paolo Bonzini wrote:
> On 9/10/24 22:03, Maxim Levitsky wrote:
> > Add 3 new tracepoints for nested VM exits which are intended
> > to capture extra information to gain insights about the nested guest
> > behavior.
> > 
> > The new tracepoints are:
> > 
> > - kvm_nested_msr
> > - kvm_nested_hypercall
> > 
> > These tracepoints capture extra register state to be able to know
> > which MSR or which hypercall was done.
> > 
> > - kvm_nested_page_fault
> > 
> > This tracepoint allows to capture extra info about which host pagefault
> > error code caused the nested page fault.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/svm/nested.c | 22 +++++++++++
> >  arch/x86/kvm/trace.h      | 82 +++++++++++++++++++++++++++++++++++++--
> >  arch/x86/kvm/vmx/nested.c | 27 +++++++++++++
> >  arch/x86/kvm/x86.c        |  3 ++
> >  4 files changed, 131 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 6f704c1037e51..2020307481553 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -38,6 +38,8 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu,
> >  {
> >  	struct vcpu_svm *svm = to_svm(vcpu);
> >  	struct vmcb *vmcb = svm->vmcb;
> > +	u64 host_error_code = vmcb->control.exit_info_1;
> > +
> >  
> >  	if (vmcb->control.exit_code != SVM_EXIT_NPF) {
> >  		/*
> > @@ -48,11 +50,15 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu,
> >  		vmcb->control.exit_code_hi = 0;
> >  		vmcb->control.exit_info_1 = (1ULL << 32);
> >  		vmcb->control.exit_info_2 = fault->address;
> > +		host_error_code = 0;
> >  	}
> >  
> >  	vmcb->control.exit_info_1 &= ~0xffffffffULL;
> >  	vmcb->control.exit_info_1 |= fault->error_code;
> >  
> > +	trace_kvm_nested_page_fault(fault->address, host_error_code,
> > +				    fault->error_code);
> > +
> 
> I disagree with Sean about trace_kvm_nested_page_fault. It's a useful
> addition and it is easier to understand what's happening with a
> dedicated tracepoint (especially on VMX).
> 
> Tracepoint are not an exact science and they aren't entirely kernel API.
> At least they can just go away at any time (changing them is a lot
> more tricky, but their presence is not guaranteed). The one below has
> the slight ugliness of having to do some computation in
> nested_svm_vmexit(), this one should go in.
> 
> >  	nested_svm_vmexit(svm);
> >  }
> >  
> > @@ -1126,6 +1132,22 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> >  				       vmcb12->control.exit_int_info_err,
> >  				       KVM_ISA_SVM);
> >  
> > +	/* Collect some info about nested VM exits */
> > +	switch (vmcb12->control.exit_code) {
> > +	case SVM_EXIT_MSR:
> > +		trace_kvm_nested_msr(vmcb12->control.exit_info_1 == 1,
> > +				     kvm_rcx_read(vcpu),
> > +				     (vmcb12->save.rax & 0xFFFFFFFFull) |
> > +				     (((u64)kvm_rdx_read(vcpu) << 32)));
> > +		break;
> > +	case SVM_EXIT_VMMCALL:
> > +		trace_kvm_nested_hypercall(vmcb12->save.rax,
> > +					   kvm_rbx_read(vcpu),
> > +					   kvm_rcx_read(vcpu),
> > +					   kvm_rdx_read(vcpu));
> > +		break;
> 
> Here I probably would have preferred an unconditional tracepoint giving
> RAX/RBX/RCX/RDX after a nested vmexit. This is not exactly what Sean
> wanted but perhaps it strikes a middle ground? I know you wrote this
> for a debugging tool, do you really need to have everything in a single
> tracepoint, or can you correlate the existing exit tracepoint with this
> hypothetical trace_kvm_nested_exit_regs, to pick RDMSR vs. WRMSR?

Hi!

If the new trace_kvm_nested_exit_regs tracepoint has a VM exit number argument, then
I can enable this new tracepoint twice with a different filter
(vm_exit_num number == msr and vm_exit_num == vmcall),
and each instance will count the events that I need.

So this can work.

Thanks!

Best regards,
	Maxim Levitsky

> 
> Paolo
> 
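As an illustration of the "one tracepoint, two filters" idea discussed above, here is a minimal user-space sketch in C. It is not kernel code and not part of the patch: the trace_kvm_nested_exit_regs tracepoint is only hypothetical at this point, so the record layout (struct nested_exit_regs) and the helper names (count_exits, msr_value) are invented for the example. The only things taken from real sources are the SVM exit code values and the way the 64-bit MSR value is assembled from RAX/RDX, which mirrors the expression in the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SVM exit codes, as in arch/x86/include/uapi/asm/svm.h */
#define SVM_EXIT_MSR     0x07c
#define SVM_EXIT_VMMCALL 0x081

/*
 * Hypothetical record for one nested VM exit: the exit code plus the
 * four GPRs an unconditional tracepoint would report. The layout is
 * an assumption made for this sketch only.
 */
struct nested_exit_regs {
	uint64_t exit_code;
	uint64_t rax, rbx, rcx, rdx;
};

/*
 * Emulates one "instance" of the tracepoint with a filter on the exit
 * code: counts only the events whose exit_code matches.
 */
static size_t count_exits(const struct nested_exit_regs *events, size_t n,
			  uint64_t exit_code)
{
	size_t i, count = 0;

	assert(events != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (events[i].exit_code == exit_code)
			count++;
	return count;
}

/*
 * For an MSR exit, the MSR index is in RCX and the value is EDX:EAX.
 * Same combination as in the quoted patch.
 */
static uint64_t msr_value(const struct nested_exit_regs *ev)
{
	return (ev->rax & 0xFFFFFFFFull) | (ev->rdx << 32);
}
```

Calling count_exits() once with SVM_EXIT_MSR and once with SVM_EXIT_VMMCALL over the same event stream gives the two per-exit-type counters, which is the effect of enabling the same tracepoint twice with different filters.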