Re: [PATCH] KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs

On Mon, Feb 28, 2022 at 12:27 AM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
>
> On 27/2/2022 7:41 am, Jim Mattson wrote:
> > AMD EPYC CPUs never raise a #GP for a WRMSR to a PerfEvtSeln MSR. Some
> > reserved bits are cleared, and some are not. Specifically, on
> > Zen3/Milan, bits 19 and 42 are not cleared.
>
> Out of curiosity, is there any additional documentation on what bits 19 and
> 42 are for? Presumably we only need this part of the logic for the Zen3
> guest CPU model, at least for now.

With the help of an older revision of the APM I found at
https://www.ii.uib.no/~osvik/x86-64/24593.pdf, we can see that bit 19,
on AMD as well as Intel, is the deprecated "Pin Control" bit. I
believe bit 42 is new on Zen3/Milan, but aside from being useful for
fixing erratum #1292, I don't have any idea what it does. Note that
bits 40 and 41 were reserved bits before SVM was introduced, and
should be treated as such for VMs that do not support SVM. Hence, the
motivation for this change is still, as previously mentioned, the
egregious behavior of the Linux perf subsystem with respect to the
Host-Only bit. This is necessary for all AMD vCPUs that do not support
SVM, regardless of model.

> >
> > When emulating such a WRMSR, KVM should not synthesize a #GP,
> > regardless of which bits are set. However, undocumented bits should
>
> If KVM chooses to emulate different #GP behavior on AMD and Intel for
> "reserved bits without qualification"[0], there should be more code for almost
> all MSRs to be checked one by one.

I think you are manufacturing a problem that doesn't exist.

> [0] "If a field is marked reserved without qualification, software must not
> change the state of that field; it must reload that field with the same value
> returned from a prior read."

Unfortunately, some software (e.g. Linux perf) ignores this
restriction. If, in spite of its misbehavior, the software works fine
on bare metal, we should do whatever is necessary to make it work in a
VM as well.

> > not be passed through to the hardware MSR. So, rather than checking
> > for reserved bits and synthesizing a #GP, just clear the reserved
> > bits.
>
> wrmsr -a 0xc0010200 0xfffffcf000280000
> rdmsr -a 0xc0010200 | sort | uniq
> # 0x40000080000 (expected)
>
> According to the test, the hardware retains the state of bits 19 and 42
> somewhere on the host.
>
> Shouldn't KVM emulate this retention behavior as well?

I'm happy to revert your change that added bit 19 to the reserved
bits. I can remove bit 42 as well, but I don't see the need. Bit 42,
unlike bit 19, has never been documented.
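
To make the arithmetic in the quoted test concrete, here is a minimal
sketch in plain C (not KVM code). The helper name evtsel_clear_reserved()
and the macro names are hypothetical, chosen only for illustration; the
two constants come from the wrmsr/rdmsr output quoted above. It shows that
the value read back is exactly bits 19 and 42, and how the choice of
reserved mask decides whether an emulation that clears reserved bits would
retain them.

```c
#include <stdint.h>

#define EVTSEL_BIT(n) (1ULL << (n))

/* Value written by the wrmsr test quoted above. */
#define TEST_WRITE 0xfffffcf000280000ULL

/* The two bits that the quoted rdmsr output shows as retained. */
#define RETAINED_BITS (EVTSEL_BIT(19) | EVTSEL_BIT(42))

/*
 * Hypothetical helper: emulate a write by silently dropping every bit
 * in reserved_mask instead of raising a fault.
 */
static uint64_t evtsel_clear_reserved(uint64_t data, uint64_t reserved_mask)
{
	return data & ~reserved_mask;
}
```

With a reserved mask that covers the whole test pattern, the emulated MSR
reads back as 0. With bits 19 and 42 removed from the mask, it reads back
as 0x40000080000, matching the bare-metal result.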

> >
> > This may seem pedantic, but since KVM currently does not support the
> > "Host/Guest Only" bits (41:40), it is necessary to clear these bits
>
> I would have thought you had code to emulate the "Host/Guest Only"
> bits for nested SVM PMU to fix this issue fundamentally.

GCP doesn't support nested SVM at all, so we have no such code.
Regardless, as you can see from the old APM referenced above, these
bits were reserved on AMD CPUs that don't support SVM. They should
also be reserved on virtual CPUs that don't support SVM. That much, at
least, KVM gets right today.
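
As a rough illustration of that point, the sketch below (plain C, not the
actual KVM implementation) treats bits 41:40 as reserved whenever the
vCPU lacks SVM, and clears reserved bits on a write rather than faulting.
The function names and the vcpu_has_svm parameter are hypothetical.
Leaving the bits alone for an SVM-capable vCPU is an assumption of this
sketch only; as noted above, KVM does not currently support these bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* "Host/Guest Only" bits (41:40) of a PerfEvtSeln MSR. */
#define EVTSEL_GUEST_ONLY (1ULL << 40)
#define EVTSEL_HOST_ONLY  (1ULL << 41)

/*
 * Hypothetical helper: extend a base reserved mask with bits 41:40 when
 * the vCPU does not support SVM, since those bits were reserved on AMD
 * CPUs that predate SVM.
 */
static uint64_t evtsel_reserved_mask(uint64_t base_reserved, bool vcpu_has_svm)
{
	if (!vcpu_has_svm)
		base_reserved |= EVTSEL_HOST_ONLY | EVTSEL_GUEST_ONLY;
	return base_reserved;
}

/* Hypothetical helper: accept the write, dropping reserved bits, no #GP. */
static uint64_t evtsel_emulate_write(uint64_t data, uint64_t reserved)
{
	return data & ~reserved;
}
```

A guest that sets "Host Only" on a vCPU without SVM would then see the
bit dropped and no fault, which is the behavior Linux perf expects.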

> > rather than synthesizing #GP, because some popular guests (e.g. Linux)
> > will set the "Host Only" bit even on CPUs that don't support
> > EFER.SVME, and they don't expect a #GP.
>
> IMO, this fix is just a reprieve.
>
> The logic of special handling of #GP only for AMD PMU MSR's
> "reserved without qualification" bits is asymmetric in the KVM/svm
> context and will confuse users even more.

I'm happy to entertain alternative suggestions.


