On 2/8/2024 11:31 AM, Paolo Bonzini wrote: > On Thu, Jan 25, 2024 at 1:59 AM Yunhong Jiang > <yunhong.jiang@xxxxxxxxxxxxxxx> wrote: >> Would it be ok to move the followed emulator_leave_smm() code into >> vmx_leave_smm, before setting nested_run_pending bit? It avoids changing >> the generic emulator code. >> >> #ifdef CONFIG_X86_64 >> if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) >> return rsm_load_state_64(ctxt, &smram.smram64); >> else >> #endif >> return rsm_load_state_32(ctxt, &smram.smram32); > > I agree with Michal that vendor code should not be in charge of > calling rsm_load_state_*. > > However, I don't understand the problem or the fix. > > The possible causes are two, and in neither case the fix is to clear > nested_run_pending: > > 1) if the problem is that setting nested_run_pending was premature, > the correct fix in my opinion is to split the leave_smm vendor > callback in "prepare" and "commit" parts; not to clear it later. See > the attached, untested, patch. Hi, I've tested the patch and it seems to work, both on Intel and AMD. There was a problem with applying this chunk though: diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index ac8b7614e79d..3d18fa7db353 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -119,7 +119,8 @@ KVM_X86_OP(setup_mce) #ifdef CONFIG_KVM_SMM KVM_X86_OP(smi_allowed) KVM_X86_OP() // <- This shouldn't be there I guess ? -KVM_X86_OP(leave_smm) +KVM_X86_OP(leave_smm_prepare) +KVM_X86_OP(leave_smm_commit) KVM_X86_OP(enable_smi_window) #endif KVM_X86_OP_OPTIONAL(dev_get_attr) Anyway I was a bit averse to this approach as I noticed in the git log that callbacks like e.g post_leave_smm() used to exist, but they were later removed, so I though the maintainers don't like introducing extra callbacks. > > 2) otherwise, if the problem is that we have not gone through the > vmenter yet, then KVM needs to do that and _then_ inject the triple > fault. The fix is to merge the .triple_fault and .check_nested_events > callbacks, with something like the second attached patch - which > probably has so many problems that I haven't even tried to compile it. Well, in this case if we know that RSM will fail it doesn't seem to me like it make sense to run vmenter just do kill the VM anyway, this would be more confusing. I've made the fix this way based on our discussion with Sean in v1, and tried to mark the RSM instruction with a flag, as a one that needs actual HW VMenter to complete succesfully, and based on that information manipulate nested_run_pending. > > The first patch should be equivalent to yours, and I guess it is > okay-ish with a few more comments that explain what's going on. Sure, I'm fine with this. Thanks ! > > Paolo