Eduardo Habkost <ehabkost@xxxxxxxxxx> writes: > On Tue, Sep 22, 2020 at 05:38:12PM +0200, Vitaly Kuznetsov wrote: >> Eduardo Habkost <ehabkost@xxxxxxxxxx> writes: >> >> > This addresses the following crash when running Linux v5.8 >> > with kernel-irqchip=off: >> > >> > qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0 >> > qemu-system-x86_64: ../target/i386/kvm.c:2714: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. >> > >> > There is a kernel-side fix for the issue too (kernel commit >> > d831de177217 "KVM: x86: always allow writing '0' to >> > MSR_KVM_ASYNC_PF_EN"), but it's nice to simply not trigger >> > the bug if running an older kernel. >> > >> > Fixes: https://bugs.launchpad.net/bugs/1896263 >> > Signed-off-by: Eduardo Habkost <ehabkost@xxxxxxxxxx> >> > --- >> > target/i386/kvm.c | 7 ++++++- >> > 1 file changed, 6 insertions(+), 1 deletion(-) >> > >> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c >> > index 9efb07e7c83..1492f41349f 100644 >> > --- a/target/i386/kvm.c >> > +++ b/target/i386/kvm.c >> > @@ -2818,7 +2818,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level) >> > kvm_msr_entry_add(cpu, MSR_IA32_TSC, env->tsc); >> > kvm_msr_entry_add(cpu, MSR_KVM_SYSTEM_TIME, env->system_time_msr); >> > kvm_msr_entry_add(cpu, MSR_KVM_WALL_CLOCK, env->wall_clock_msr); >> > - if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF)) { >> > + /* >> > + * Some kernel versions (v5.8) won't let MSR_KVM_ASYNC_PF_EN to be set >> > + * at all if kernel-irqchip=off, so don't try to set it in that case. >> > + */ >> > + if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_ASYNC_PF) && >> > + kvm_irqchip_in_kernel()) { >> > kvm_msr_entry_add(cpu, MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr); >> > } >> > if (env->features[FEAT_KVM] & (1 << KVM_FEATURE_PV_EOI)) { >> >> I'm not sure kvm_irqchip_in_kernel() was required before we switched to >> interrupt-based APF (as we were always injecting #PF) but with >> kernel-5.8+ this should work. [...] > > Were guests able to set MSR_KVM_ASYNC_PF_EN to non-zero with > kernel-irqchip=off on hosts running Linux <= 5.7? lapic_in_kernel() check appeared in kernel with the following commit: commit 9d3c447c72fb2337ca39f245c6ae89f2369de216 Author: Wanpeng Li <wanpengli@xxxxxxxxxxx> Date: Mon Jun 29 18:26:31 2020 +0800 KVM: X86: Fix async pf caused null-ptr-deref which was post-interrupt-based-APF. I *think* it was OK to enable APF with !lapic_in_kernel() before (at least I don't see what would not allow that). > I am assuming > kvm-asyncpf never worked with kernel-irqchip=off (and enabling it > by default with kernel-irqchip=off was a mistake). > > >> [...] We'll need to merge this with >> >> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg02963.html >> (queued by Paolo) and >> https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06196.html >> which fixes a bug in it. >> >> as kvm_irqchip_in_kernel() should go around both KVM_FEATURE_ASYNC_PF >> and KVM_FEATURE_ASYNC_PF_INT I believe. > > Shouldn't we just disallow kvm-asyncpf-int=on if kernel-irqchip=off? (Sarcasm: if disallowing 'kernel-irqchip=off' is not an option, then) yes, we probably can, but kvm-asyncpf-int=on is the default we have so we can't just implicitly change it underneath or migration will break... -- Vitaly