Re: Deadlock due to EPT_VIOLATION

On Mon, Aug 14, 2023, Eric Wheeler wrote:
> On Tue, 8 Aug 2023, Sean Christopherson wrote:
> > > If you have any suggestions on how modifying the host kernel (and then migrating
> > > a locked up guest to it) or eBPF programs that might help illuminate the issue
> > > further, let me know!
> > > 
> > > Thanks for all your help so far!
> > 
> > Since it sounds like you can test with a custom kernel, try running with this
> > patch and then enable the kvm_page_fault tracepoint when a vCPU gets stuck.  The
> > below expands said tracepoint to capture information about mmu_notifiers and
> > memslots generation.  With luck, it will reveal a smoking gun.
> 
> Getting this patch into production systems is challenging, perhaps live
> patching is an option:

Ah, I take it that when you gathered information after a live migration, you were
migrating VMs into a sidecar environment.

> Questions:
> 
> 1. Do you know if this would be safe to insert as a live kernel patch?

Hmm, probably not safe.

> For example, does adding to TRACE_EVENT modify a struct (which is not
> live-patch-safe) or is it something that should plug in with simple
> function redirection?

Yes, the tracepoint defines a struct, e.g. in this case trace_event_raw_kvm_page_fault.

Looking back, I think I misinterpreted an earlier response regarding bpftrace and
unnecessarily abandoned that tactic. *sigh*

If your environment provides BTF info, then this bpftrace program should provide
the mmu_notifier half of the tracepoint hack-a-patch.  If this yields nothing
interesting then we can try diving into whether or not the mmu_root is stale, but
let's cross that bridge when we have to.

I recommend loading this only when you have a stuck vCPU, as it'll be quite noisy.

kprobe:handle_ept_violation
{
	// arg0 is the struct kvm_vcpu * passed to handle_ept_violation().  Print
	// the vCPU's pid and the VM's mmu_notifier invalidation state.
	printf("vcpu = %lx pid = %u MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
	       arg0, ((struct kvm_vcpu *)arg0)->pid->numbers[0].nr,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_seq,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_in_progress,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_start,
	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_end);
}

If you don't have BTF info, we can still use a bpf program, but to get at the
fields of interest, I think we'd have to resort to pointer arithmetic with struct
offsets grabbed from your build.
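
For illustration only, a no-BTF variant might look roughly like the sketch
below.  It assumes a bpftrace new enough to accept casts to integer pointer
types, and that "struct kvm *kvm" is the first member of struct kvm_vcpu (i.e.
at offset 0), which should be verified against the running kernel.  The 0x0
offsets are placeholders, not real values; they would have to be replaced with
the offsets of the mmu_invalidate_* fields within struct kvm for the exact
build, e.g. as reported by pahole or gdb's "ptype /o" on a kvm module with
debug info.

kprobe:handle_ept_violation
{
	// ASSUMPTION: vcpu->kvm lives at offset 0 of struct kvm_vcpu.
	$kvm = *(uint64 *)arg0;

	// PLACEHOLDER offsets into struct kvm; substitute the values from
	// your build before trusting any of the output.
	$seq_off    = 0x0;	// mmu_invalidate_seq
	$inprog_off = 0x0;	// mmu_invalidate_in_progress
	$start_off  = 0x0;	// mmu_invalidate_range_start
	$end_off    = 0x0;	// mmu_invalidate_range_end

	printf("vcpu = %lx MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
	       arg0,
	       *(uint64 *)($kvm + $seq_off),
	       *(uint64 *)($kvm + $inprog_off),
	       *(uint64 *)($kvm + $start_off),
	       *(uint64 *)($kvm + $end_off));
}

The pid is omitted here because reaching it would require further offsets
(vcpu->pid, then the numbers[] array inside struct pid), and the vcpu pointer
should be enough to tell the vCPUs apart.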


