> -----Original Message----- > From: Paolo Bonzini <pbonzini@xxxxxxxxxx> > Sent: Friday, November 11, 2022 3:58 AM > To: Peter Zijlstra <peterz@xxxxxxxxxxxxx>; H. Peter Anvin <hpa@xxxxxxxxx> > Cc: Christopherson,, Sean <seanjc@xxxxxxxxxx>; Li, Xin3 <xin3.li@xxxxxxxxx>; > linux-kernel@xxxxxxxxxxxxxxxx; x86@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; > tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; bp@xxxxxxxxx; > dave.hansen@xxxxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx> > Subject: Re: [PATCH 5/6] KVM: x86/VMX: add kvm_vmx_reinject_nmi_irq() for > NMI/IRQ reinjection > > On 11/11/22 11:45, Peter Zijlstra wrote: > >> What is "correct" in this context? > > > > I don't know since I don't really speak virt, but I could image the > > regset that would match the vmrun (or whatever intel decided to call > > that again) instruction. > > Right now it is not exactly that but close. The RIP is somewhere in > vmx_do_interrupt_nmi_irqoff; CS/SS are correct (i.e. it's not like they point to > guest values!) and other registers including RSP and RFLAGS are consistent with > the RIP. > > >> Currently KVM basically stuff random data into pt_regs; this at least > >> makes it explicitly zero. > > > > 🙁 Both is broken. Once again proving to me that virt is a bunch of > > duck-tape at best. > > Except currently it is not random. At least I'm not alone in sometimes thinking I > understand stuff when I actually don't. > > Zero is just wrong, I agree. Xin, if you don't want to poke at the IDT you need > to build the pt_regs and pass them to the function you add in patch 5. In order > to make it like Peter wants it, then: > > 1) for the flags just use X86_EFLAGS_FIXED > > 2) for CS/SS segment selectors use __KERNEL_CS and __KERNEL_DS > > 3) the RIP/RSP can be found respectively in (unsigned long)vmx_vmexit > vmx->loaded_vmcs->host_state.rsp > > 4) the other registers can be taken from vcpu->arch.regs > > But I am not sure it's an improvement. It may be okay to zero the registers, but > setting CS/RIP/SS/RSP/RFLAGS to the actual processor values in > vmx_do_interrupt_nmi_irqoff is safer. > > I'm not sure who would set orig_rax, but I haven't looked too closely. > Perhaps that's not a problem, but if so it has to be documented in the commit > message. No IRQ handler uses a pt_regs structure as an argument, while NMI handlers do. For NMI based profiling, what it wants maybe is more of the guest RIP? I might make it over complicated by mixing host/guest profiling. > Paolo