Re: MCG_CAP ABI breakage (was Re: [Qemu-devel] [PATCH] target-i386: Do not set MCG_SER_P by default)

Borislav Petkov <bp@xxxxxxxxx> · Mon, 23 Nov 2015 21:46:20 +0100

On Mon, Nov 23, 2015 at 05:42:08PM -0200, Eduardo Habkost wrote:
> I will let the people working on the actual MCE emulation in KVM
> answer that. I am assuming that KVM_MCE_CAP_SUPPORTED is set to
> something that makes sense.

Well, that should be, IMHO, handled the same way as all those feature
bits assigned to the ->feature arrays of the different cpu types in
qemu's X86CPUDefinition descriptors.

> Note that we don't mimick every single detail of real CPUs out
> there, and this isn't necessarily a problem (although sometimes
> we choose bad defaults). Do you see real world scenarios when
> choosing 10 as the default causes problems for guest OSes, or you
> just worry that this might cause problems because it doesn't
> match any real-world CPU?

Well, the problems would come when the guests start using the MCA
infrastructure bits. That's why I asked how exactly people imagine
doing all the hardware error handling in the guest.

I know we do something with poisoning pages, i.e.
kvm_send_hwpoison_signal() and all that machinery around it but in that
particular case it is the hypervisor which marks the pages as poison
and kvm notices that on the __get_user_pages() path and the error is
injected into the guest. AFAICT, of course.

In my case, I'm injecting a HW error in the guest kernel by writing into
the *guest* MSRs and the *guest* kernel MCA code is supposed to handle
the error.

And the problem here is that I'm emulating an AMD guest, but one which
sports an Intel-only feature (MCG_SER_P), and that puzzles the guest
kernel.
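
To make the mismatch concrete, here is a minimal sketch of how MCG_CAP
could be decoded and sanity-checked against the guest's vendor. The bit
positions follow the IA32_MCG_CAP layout in the Intel SDM; the enum and
the helper names are hypothetical, made up for illustration, and are not
taken from qemu or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of IA32_MCG_CAP as documented in the Intel SDM. */
#define MCG_BANKCNT_MASK 0xffULL       /* bits 7:0: number of MCA banks */
#define MCG_CTL_P        (1ULL << 8)   /* IA32_MCG_CTL is present */
#define MCG_SER_P        (1ULL << 24)  /* software error recovery support */

/* Hypothetical vendor tag, for illustration only. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD };

/* Number of MCA banks advertised in MCG_CAP. */
static inline unsigned int mcg_bank_count(uint64_t mcg_cap)
{
    return (unsigned int)(mcg_cap & MCG_BANKCNT_MASK);
}

/*
 * Assumption from this thread: SER_P is an Intel-only feature, so an
 * AMD guest that sees it set has an inconsistent MCG_CAP.
 */
static inline bool mcg_cap_matches_vendor(enum cpu_vendor vendor,
                                          uint64_t mcg_cap)
{
    if (vendor == VENDOR_AMD && (mcg_cap & MCG_SER_P))
        return false;
    return true;
}
```

With 10 banks and SER_P set, mcg_cap_matches_vendor(VENDOR_AMD, ...)
returns false, which is roughly the situation my guest kernel ends up
in.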

Does that make more sense? I hope...

> If we really care about matching the number of banks of real
> CPUs, we can make it configurable, defined by the CPU model,
> and/or have better defaults in future machine-types. That won't
> be a problem.

I think we should try to do that if we're striving for accurate
emulation of guest CPUs. But then there's the migration use-case which
has a different focus...

> But I still don't know what we should do when somebody runs:
>   -machine pc-i440fx-2.4 -cpu Penryn
> on a host kernel that doesn't report MCG_SER_P on
> KVM_MCE_CAP_SUPPORTED.

Right, before we ask that question we should ask the more generic one:
how do people want to do error handling in the guest? Do they even want
to? More importantly, does it even make sense to handle hardware errors
in the guest? If so, which and if not, why not?

I mean, no one would've noticed the MCG_SER_P issue if no one had tried
to use it and what it implies. So it all comes down to whether the
guest uses the emulated feature.

It seems to me this issue remained unnoticed for such a long time now
for the simple reason that nothing used it. So nothing in the guest
cared whether SER_P is set or not, or how many MCA banks there are.

So if you run "-machine pc-i440fx-2.4 -cpu Penryn" it wouldn't matter
because, AFAIK - and correct me if I'm wrong here - the guest never
got to see the Action Required and Action Optional MCEs which result
from SER_P support. So the guest didn't care.
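
For reference, a simplified sketch of how the S and AR bits in
MCi_STATUS separate those classes when SER_P is advertised. The bit
positions are the ones from the Intel SDM; the enum and mce_classify()
are hypothetical names, the EN bit is ignored, and treating every
uncorrected error as fatal when SER_P is absent is a simplifying
assumption, not what any particular kernel does.

```c
#include <stdint.h>

/* IA32_MCi_STATUS bits as documented in the Intel SDM. */
#define MCI_STATUS_VAL (1ULL << 63)  /* register contents are valid */
#define MCI_STATUS_UC  (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_PCC (1ULL << 57)  /* processor context corrupt */
#define MCI_STATUS_S   (1ULL << 56)  /* signaled via MCE (needs SER_P) */
#define MCI_STATUS_AR  (1ULL << 55)  /* action required (needs SER_P) */

/* Hypothetical classification, for illustration only. */
enum mce_class {
    MCE_NONE,
    MCE_CORRECTED,
    MCE_FATAL,
    MCE_UCNA,             /* uncorrected, no action required */
    MCE_ACTION_OPTIONAL,  /* SRAO */
    MCE_ACTION_REQUIRED,  /* SRAR */
};

static inline enum mce_class mce_classify(uint64_t status, int ser_p)
{
    if (!(status & MCI_STATUS_VAL))
        return MCE_NONE;
    if (!(status & MCI_STATUS_UC))
        return MCE_CORRECTED;
    if (status & MCI_STATUS_PCC)
        return MCE_FATAL;
    /* Assumption: without SER_P, S and AR carry no meaning. */
    if (!ser_p)
        return MCE_FATAL;
    if (!(status & MCI_STATUS_S))
        return MCE_UCNA;
    return (status & MCI_STATUS_AR) ? MCE_ACTION_REQUIRED
                                    : MCE_ACTION_OPTIONAL;
}
```

The point being: the AR/AO distinction only exists on the path where
ser_p is true, so a guest which never sees such errors has no reason to
look at SER_P at all.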

Yes, no, am I missing something completely here?

> I am just saying we already clear it when running on Linux
> v2.6.32-v2.6.36, it doesn't matter the host CPU or the -cpu
> options we have. And we do not clear it when running Linux
> v2.6.37 or newer. That's the behavior of pc-*-2.4 and older, even
> if we change it on future machine-types.

Right, ok. So the fact that it was clear in the v2.6.32-v2.6.36
timeframe and set later and nothing complained, *probably* confirms my
theory that the guest didn't really care about that setting and it
probably doesn't now either... Unless you try to use it, like I did :-)

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.