Dear Sean and all,

I tried to re-inject the #BP into the guest from userspace using the
KVM_SET_VCPU_EVENTS ioctl. However, after some time the VM exits with
exit reason KVM_EXIT_DEBUG stop happening. I suspect that the guest RIP
is not being updated when I re-inject the #BP exception. The kprobe logs
show that the kprobe handlers are being hit very frequently (I am
attaching a kprobe at "free_one_page()" and I do not expect it to be
called almost every microsecond).

This is the code I wrote in userspace to re-inject the #BP exception
into the guest. The code below is from QEMU version 5.0.1.

    switch (run->exit_reason) {
    /* other exit reason handling */
    case KVM_EXIT_DEBUG: {
        struct kvm_vcpu_events events = {};

        events.exception.nr = run->debug.arch.exception;
        events.exception.has_error_code = 0;
        events.exception.pending = 1;
        events.exception.injected = 1;
        events.exception.error_code = 0;

        if (kvm_vcpu_ioctl(cpu, KVM_SET_VCPU_EVENTS, &events) < 0)
            printf("Error while doing ioctl KVM_SET_VCPU_EVENTS");

        ret = 0;
        break;
    }

Do you see a need to initialize any other structure member in
kvm_vcpu_events{}? Do I need to change any of the structure member
values that I am passing to the ioctl? Why is the RIP still not
updating?

Thank you very much for your answer again.

Best Regards,
Arnabjyoti Kalita

On Wed, May 11, 2022 at 10:32 PM Arnabjyoti Kalita <akalita@xxxxxxxxxxxxxxxxx> wrote:
>
> Thank you for your answer, Sean.
>
> I think I now have a fair idea on how to proceed. I will re-inject the
> #BP into the guest from KVM and see what happens. I'm hoping the guest
> will handle the #BP and continue execution without me needing to make
> any more changes.
>
> Best Regards,
> Arnabjyoti Kalita
>
> On Wed, May 11, 2022 at 8:34 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Wed, May 11, 2022, Arnabjyoti Kalita wrote:
> > > What could be the various ways a guest could handle #BP?
> >
> > The kernel uses INT3 to patch instructions/flows, e.g. for alternatives. For those,
> > the INT3 handler will unwind to the original RIP and retry. The #BP will keep
> > occurring until the patching completes. See text_poke_bp_batch(), poke_int3_handler(),
> > etc...
> >
> > Userspace debuggers will do something similar; after catching the #BP, the original
> > instruction is restored and restarted.
> >
> > The reason INT3 is a single byte is so that software can "atomically" trap/patch an
> > instruction without having to worry about cache line splits. CPUs are guaranteed
> > to either see the INT3 or the original instruction in its entirety, i.e. other CPUs
> > will never decode a half-baked instruction.
> >
> > The kernel has even fancier uses for things like static_call(), e.g. emulating
> > CALL, RET, and JMP from the #BP handler.
> >
> > > Can we "make" the guest skip the instruction that caused the #BP ?
> >
> > Well, technically yes, that's effectively what would happen if the host skips the
> > INT3 and doesn't inject the #BP. Can you do that and expect the guest not to
> > crash? Nope.
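
To make the "unwind to the original RIP and retry" behaviour quoted above
more concrete, here is a minimal userspace sketch in C. It is only a toy
model, not kernel or KVM code: the names toy_cpu, toy_patch, toy_bp_handler,
toy_step and toy_run are hypothetical, every "instruction" is assumed to be
one byte long, and "patching completes" is modelled as a simple countdown.
The real logic lives in poke_int3_handler() and is considerably more
involved. The one architectural fact the sketch relies on is that #BP is a
trap, so the saved RIP points just past the INT3 byte when the handler runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC
#define INT3_SIZE   1

/* Toy CPU state: rip is an index into a byte array standing in for text. */
struct toy_cpu {
    size_t rip;
    unsigned bp_count;          /* number of #BP traps taken so far */
};

/* Hypothetical patch site: one byte temporarily replaced by INT3. */
struct toy_patch {
    size_t addr;                /* location of the patched byte */
    uint8_t orig;               /* byte to write back when patching is done */
    unsigned traps_until_done;  /* assumption: patching ends after N traps */
};

/* Models the handler: rewind rip to the INT3 so the site is retried. */
static void toy_bp_handler(struct toy_cpu *cpu, uint8_t *text,
                           struct toy_patch *p)
{
    assert(cpu->rip >= INT3_SIZE);
    cpu->bp_count++;
    cpu->rip -= INT3_SIZE;      /* saved rip pointed past the INT3 byte */

    if (cpu->rip == p->addr && p->traps_until_done > 0) {
        p->traps_until_done--;
        if (p->traps_until_done == 0)
            text[p->addr] = p->orig;  /* patching "completes" */
    }
}

/* Executes one toy instruction; returns 1 if it raised a #BP. */
static int toy_step(struct toy_cpu *cpu, uint8_t *text, struct toy_patch *p)
{
    uint8_t op = text[cpu->rip];

    cpu->rip += 1;              /* every toy instruction is one byte */
    if (op == INT3_OPCODE) {
        toy_bp_handler(cpu, text, p);
        return 1;
    }
    return 0;
}

/* Installs the INT3, runs to the end of text, returns the #BP count. */
static unsigned toy_run(uint8_t *text, size_t len, struct toy_patch *p)
{
    struct toy_cpu cpu = { 0, 0 };

    assert(p->addr < len && p->traps_until_done > 0);
    text[p->addr] = INT3_OPCODE;
    while (cpu.rip < len)
        toy_step(&cpu, text, p);
    return cpu.bp_count;
}
```

With a four-byte buffer of 0x90 bytes and a patch at index 2 that finishes
after three traps, toy_run() takes the #BP three times at the same address
and only then moves on. In this model, a host that simply advances RIP past
the INT3 without delivering the #BP would mean the byte at the patch site is
never executed in its original form, which is the "skip" scenario the quoted
reply warns about.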