On Mon, Aug 14, 2023, Eric Wheeler wrote:
> On Tue, 8 Aug 2023, Sean Christopherson wrote:
> > > If you have any suggestions on how modifying the host kernel (and then migrating
> > > a locked up guest to it) or eBPF programs that might help illuminate the issue
> > > further, let me know!
> > >
> > > Thanks for all your help so far!
> >
> > Since it sounds like you can test with a custom kernel, try running with this
> > patch and then enable the kvm_page_fault tracepoint when a vCPU gets stuck. The
> > below expands said tracepoint to capture information about mmu_notifiers and
> > memslots generation. With luck, it will reveal a smoking gun.
>
> Getting this patch into production systems is challenging, perhaps live
> patching is an option:

Ah, I take it that when you gathered information after a live migration, you
were migrating VMs into a sidecar environment.

> Questions:
>
> 1. Do you know if this would be safe to insert as a live kernel patch?

Hmm, probably not safe.

> For example, does adding to TRACE_EVENT modify a struct (which is not
> live-patch-safe) or is it something that should plug in with simple
> function redirection?

Yes, the tracepoint defines a struct, e.g. in this case
trace_event_raw_kvm_page_fault.

Looking back, I think I misinterpreted an earlier response regarding bpftrace and
unnecessarily abandoned that tactic. *sigh*

If your environment provides BTF info, then this bpftrace program should provide
the mmu_notifier half of the tracepoint hack-a-patch. If this yields nothing
interesting then we can try diving into whether or not the mmu_root is stale, but
let's cross that bridge when we have to.

I recommend loading this only when you have a stuck vCPU, since it'll be quite
noisy.

kprobe:handle_ept_violation
{
	printf("vcpu = %lx pid = %u MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
	       arg0,
	       ((struct kvm_vcpu *)arg0)->pid->numbers[0].nr,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_seq,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_in_progress,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_start,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_end);
}

If you don't have BTF info, we can still use a bpf program, but to get at the
fields of interest, I think we'd have to resort to pointer arithmetic with
struct offsets grabbed from your build.
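
For illustration, a rough and untested sketch of what that offset-based fallback
might look like. It assumes x86-64 (where the four mmu_invalidate_* fields are
each 8 bytes wide), a bpftrace version that supports casting an integer to a
pointer type, and offsets taken from your own build, e.g. by running
"pahole -C kvm_vcpu" and "pahole -C kvm" against a vmlinux with debug info. The
file name and the argument order are made up for this example, and no offset is
hard-coded because the values differ between kernels and configs. The pid is
left out since it would need a further chain of offsets through struct pid.

/*
 * Hypothetical usage, with decimal byte offsets from your build:
 *   bpftrace stuck_vcpu.bt <kvm> <seq> <in_progress> <range_start> <range_end>
 *
 * $1      = offset of "kvm" within struct kvm_vcpu
 * $2..$5  = offsets of the mmu_invalidate_* fields within struct kvm
 */
kprobe:handle_ept_violation
{
	$vcpu = arg0;

	/* Follow vcpu->kvm by reading the pointer stored at the given offset. */
	$kvm = *(uint64 *)($vcpu + $1);

	printf("vcpu = %lx MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
	       $vcpu,
	       *(uint64 *)($kvm + $2),
	       *(uint64 *)($kvm + $3),
	       *(uint64 *)($kvm + $4),
	       *(uint64 *)($kvm + $5));
}

As with the BTF version, this is best loaded only while a vCPU is stuck. If the
printed values look implausible, the offsets most likely do not match the
running kernel.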