On Mon, Nov 23, 2015 at 01:11:27PM -0200, Eduardo Habkost wrote: > On Mon, Nov 23, 2015 at 11:22:37AM -0200, Eduardo Habkost wrote: > [...] > > In the case of this code, it looks like it's already broken > > because the resulting mcg_cap depends on host kernel capabilities > > (the ones reported by kvm_get_mce_cap_supported()), and the data > > initialized by target-i386/cpu.c:mce_init() is silently > > overwritten by kvm_arch_init_vcpu(). So we would need to fix that > > before implementing a proper compatibility mechanism for > > mcg_cap. > > Fortunately, when running Linux v2.6.37 and later, > kvm_arch_init_vcpu() won't actually change mcg_cap (see details > below). > > But the code is broken if running on Linux between v2.6.32 and > v2.6.36: it will clear MCG_SER_P silently (and silently enable > MCG_SER_P when migrating to a newer host). > > But I don't know what we should do on those cases. If we abort > initialization when the host doesn't support MCG_SER_P, all CPU > models with MCE and MCA enabled will become unrunnable on Linux > between v2.6.32 and v2.6.36. Should we do that, and simply ask > people to upgrade their kernels (or explicitly disable MCE) if > they want to run latest QEMU? > > For reference, these are the capabilities returned by Linux: > * KVM_MAX_MCE_BANKS is 32 since > 890ca9aefa78f7831f8f633cab9e4803636dffe4 (v2.6.32-rc1~693^2~199) > * KVM_MCE_CAP_SUPPORTED is (MCG_CTL_P | MCG_SER_P) since > 5854dbca9b235f8cdd414a0961018763d2d5bf77 (v2.6.37-rc1~142^2~3) The commit message of that one says that there is MCG_SER_P support in the kernel. The previous commit talks about MCE injection with KVM_X86_SET_MCE ioctl but frankly, I don't see that. From looking at the current code, KVM_X86_SET_MCE does kvm_vcpu_ioctl_x86_setup_mce() which simply sets MCG_CAP. And it gets those from KVM_X86_GET_MCE_CAP_SUPPORTED which is #define KVM_MCE_CAP_SUPPORTED (MCG_CTL_P | MCG_SER_P) So it basically sets those two supported bits. But how is supported == actually present ?!?! That soo doesn't make any sense. > * KVM_MCE_CAP_SUPPORTED is MCG_CTL_P between > 890ca9aefa78f7831f8f633cab9e4803636dffe4 (v2.6.32-rc1~693^2~199) > and 5854dbca9b235f8cdd414a0961018763d2d5bf77 (v2.6.37-rc1~142^2~3) > > The current definitions in QEMU are: > #define MCE_CAP_DEF (MCG_CTL_P|MCG_SER_P) > #define MCE_BANKS_DEF 10 That's also wrong. The number of banks is, of course, generation-specific. The qemu commit adding that is 79c4f6b08009 ("QEMU: MCE: Add MCE simulation to qemu/tcg") and I really don't get what the reasoning behind it is? Is this supposed to mimick a certain default CPU which is not really resembling a real CPU or some generic CPU or what is it? Because the moment you start qemu with -cpu <something AMD>, all that MCA information is totally wrong. > The target-i386/cpu.c:mce_init() code sets mcg_cap to: > env->mcg_cap == MCE_CAP_DEF | MCE_BANKS_DEF; > == (MCG_CTL_P|MCG_SER_P) | 10; > > The kvm_arch_init_vcpu() code that changes mcg_cap does the following: > kvm_get_mce_cap_supported(cs->kvm_state, &mcg_cap, &banks); > if (banks > MCE_BANKS_DEF) { > banks = MCE_BANKS_DEF; > } > mcg_cap &= MCE_CAP_DEF; > mcg_cap |= banks; > env->mcg_cap = mcg_cap; > * Therefore, if running Linux v2.6.37 or newer, this will be > the result: > banks == 10; > mcg_cap == (MCG_CTL_P|MCG_SER_P) | banks > == (MCG_CTL_P|MCG_SER_P) | 10; > * That's the same value set by mce_init(), so fortunately > kvm_arch_init_vcpu() isn't actually changing mcg_cap > if running Linux v.2.6.37 and newer. > * However, if running Linux between v2.6.32 and v2.6.37, > kvm_arch_init_vcpu() will silently clear MCG_SER_P. I don't understand - you want that cleared when emulating a !Intel CPU or an Intel CPU which doesn't support SER. This bit should be clear there. Why should be set at all there? Hmmm? -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html