On 18/1/2022 12:08 pm, Jim Mattson wrote:
On Mon, Jan 17, 2022 at 12:57 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
On Sun, Jan 16, 2022 at 8:26 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
...
It's easy for KVM to clear the reserved bit PERF_CTL2[43]
for only (AMD Family 19h Models 00h-0Fh) guests.
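A minimal sketch of what clearing that bit for only those guests could look like. The helper names (guest_is_fam19h_model_0x, sanitize_perf_ctl2) are hypothetical and are not existing KVM functions; the family/model decoding assumes the usual AMD CPUID Fn0000_0001_EAX layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* PERF_CTL2[43], the reserved bit under discussion. */
#define PERF_CTL2_RSVD_BIT43 (1ULL << 43)

/* Decode the family from CPUID Fn0000_0001_EAX. */
static unsigned int cpuid_family(uint32_t eax)
{
	unsigned int base = (eax >> 8) & 0xf;
	unsigned int ext = (eax >> 20) & 0xff;

	/* The extended family is only added when the base family is 0xf. */
	return base == 0xf ? base + ext : base;
}

/* Decode the model from CPUID Fn0000_0001_EAX. */
static unsigned int cpuid_model(uint32_t eax)
{
	unsigned int base = (eax >> 4) & 0xf;
	unsigned int ext = (eax >> 16) & 0xf;

	/* The extended model only counts when the base family is 0xf. */
	return ((eax >> 8) & 0xf) == 0xf ? (ext << 4) | base : base;
}

/* Hypothetical helper: true for Family 19h Models 00h-0Fh. */
static bool guest_is_fam19h_model_0x(uint32_t eax)
{
	return cpuid_family(eax) == 0x19 && cpuid_model(eax) <= 0x0f;
}

/* Hypothetical helper: drop bit 43 from a guest write on those models only. */
static uint64_t sanitize_perf_ctl2(uint32_t guest_cpuid_eax, uint64_t data)
{
	if (guest_is_fam19h_model_0x(guest_cpuid_eax))
		data &= ~PERF_CTL2_RSVD_BIT43;
	return data;
}
```

With a guest CPUID signature of 0x00A00F11 (Family 19h Model 01h), a write with bit 43 set comes back with that bit cleared; any other family or model leaves the value untouched.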
KVM is currently *way* too aggressive about synthesizing #GP for
"reserved" bits on AMD hardware. Note that "reserved" generally has a
much weaker definition in AMD documentation than in Intel
documentation. When Intel says that an MSR bit is "reserved," it means
that an attempt to set the bit will raise #GP. When AMD says that an
MSR bit is "reserved," it does not necessarily mean the same thing.
I agree. And I'm curious as to why there are hardly any guest user complaints.
The term "reserved" is described in the AMD "Conventions and Definitions":
Fields marked as reserved may be used at some future time. To preserve
compatibility with future processors, reserved fields require special
handling when read or written by software. Software must not depend on the
state of a reserved field (unless qualified as RAZ), nor upon the ability of
such fields to return a previously written state.
If a field is marked reserved *without qualification*, software must not
change the state of that field; it must reload that field with the same value
returned from a prior read.
Reserved fields may be qualified as IGN, MBZ, RAZ, or SBZ.
For AMD, #GP comes from "Writing 1 to any bit that must be zero (MBZ) in the MSR."
(Usually, AMD will write MBZ to indicate that the bit must be zero.)
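As a rough sketch of that rule, a write check would fault only on bits explicitly qualified as MBZ and let every other reserved bit through. The struct, the function name, and the example mask used with it are hypothetical illustrations, not taken from KVM or the APM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-MSR description of reserved bits. */
struct msr_rsvd_desc {
	uint64_t mbz_mask;	/* bits qualified as "reserved, MBZ" */
};

/*
 * Returns true if the write should raise #GP. Only MBZ bits are checked;
 * reserved bits without the MBZ qualifier never cause a fault here.
 */
static bool msr_write_faults(const struct msr_rsvd_desc *desc, uint64_t data)
{
	return (data & desc->mbz_mask) != 0;
}
```

An MSR with no MBZ bits gets an mbz_mask of 0, so no write to it would ever fault under this scheme.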
On my Zen3 CPU, I can write 0xffffffffffffffff to MSR 0xc0010204,
without getting a #GP. Hence, KVM should not synthesize a #GP for any
writes to this MSR.
; storage behind bit 43 test
; CPU family: 25
; Model: 1
wrmsr -p 0 0xc0010204 0x80000000000
rdmsr -p 0 0xc0010204 # returns 0x80000000000
Note that the value I get back from rdmsr is 0x30fffdfffff, so there
appears to be no storage behind bit 43. If KVM allows this bit to be
set, it should ensure that reads of this bit always return 0, as they
do on hardware.
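A minimal sketch of that behavior: accept every write without faulting, but keep only the bits that appear to have storage, so bits without storage read back as 0. The mask below is an assumption inferred from the single 0x30fffdfffff readback quoted above; other parts may report a different value, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: bits with storage, taken from the 0x30fffdfffff readback. */
#define PERF_CTL_STORAGE_MASK 0x30fffdfffffULL

/* Hypothetical emulated state for one PERF_CTL MSR. */
struct perf_ctl_state {
	uint64_t value;
};

/* Never faults; bits without storage are silently dropped. */
static void perf_ctl_write(struct perf_ctl_state *msr, uint64_t data)
{
	msr->value = data & PERF_CTL_STORAGE_MASK;
}

/* Bits outside the storage mask always read as 0. */
static uint64_t perf_ctl_read(const struct perf_ctl_state *msr)
{
	return msr->value;
}
```

Writing 0xffffffffffffffff and reading back yields 0x30fffdfffff under this mask, and a write of only bit 43 reads back as 0.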
PERF_CTL2[43] is marked reserved without qualification in Figure 13-7.
I'm not sure we really need a sweeping cleanup of the #GP handling for all of
SVM's non-MBZ reserved bits.
Bit 19 (Intel's old Pin Control bit) seems to have storage behind it.
It is interesting that in Figure 13-7 "Core Performance Event-Select
Register (PerfEvtSeln)" of the APM volume 2, this "reserved" bit is
not marked in grey. The remaining "reserved" bits (which are marked in
grey) should probably be annotated with "RAZ."
In any diagram, we have at least three types of "reservation":
- Reserved + grey
- Reserved, MBZ + grey
- Reserved + no grey
So it is better not to think of "Reserved + grey" as "Reserved, MBZ + grey".