On 11/11/22 11:45, Peter Zijlstra wrote:
What is "correct" in this context?
I don't know since I don't really speak virt, but I could image the
regset that would match the vmrun (or whatever intel decided to call
that again) instruction.
Right now it is not exactly that but close. The RIP is somewhere in
vmx_do_interrupt_nmi_irqoff; CS/SS are correct (i.e. it's not like they
point to guest values!) and other registers including RSP and RFLAGS are
consistent with the RIP.
Currently KVM basically stuff random data into pt_regs; this at least
makes it explicitly zero.
🙁 Both is broken. Once again proving to me that virt is a bunch of
duck-tape at best.
Except currently it is not random. At least I'm not alone in sometimes
thinking I understand stuff when I actually don't.
Zero is just wrong, I agree. Xin, if you don't want to poke at the IDT
you need to build the pt_regs and pass them to the function you add in
patch 5. In order to make it like Peter wants it, then:
1) for the flags just use X86_EFLAGS_FIXED
2) for CS/SS segment selectors use __KERNEL_CS and __KERNEL_DS
3) the RIP/RSP can be found respectively in (unsigned long)vmx_vmexit
vmx->loaded_vmcs->host_state.rsp
4) the other registers can be taken from vcpu->arch.regs
But I am not sure it's an improvement. It may be okay to zero the
registers, but setting CS/RIP/SS/RSP/RFLAGS to the actual processor
values in vmx_do_interrupt_nmi_irqoff is safer.
I'm not sure who would set orig_rax, but I haven't looked too closely.
Perhaps that's not a problem, but if so it has to be documented in the
commit message.
Paolo