On Tue, Jul 20, 2021, Brijesh Singh wrote:
>
> On 7/19/21 7:10 PM, Sean Christopherson wrote:
> > On Wed, Jul 07, 2021, Brijesh Singh wrote:
> > > Follow the recommendation from APM2 section 15.36.10 and 15.36.11 to
> > > resolve the RMP violation encountered during the NPT table walk.
> >
> > Heh, please elaborate on exactly what that recommendation is.  A recommendation
> > isn't exactly architectural, i.e. is subject to change :-)
>
> I will try to expand it :)
>
> > And, do we have to follow the APM's recommendation?
>
> Yes, unless we want to be very strict on what a guest can do.
>
> > Specifically, can KVM treat #NPF RMP violations as guest errors, or is that
> > not allowed by the GHCB spec?
>
> The GHCB spec does not say anything about the #NPF RMP violation error. And
> not all #NPF RMP is a guest error (mainly those size mismatch etc).
>
> > I.e. can we mandate accesses be preceded by page state change requests?
>
> This is a good question, the GHCB spec does not enforce that a guest *must*
> use page state. If the page state changes is not done by the guest then it
> will cause #NPF and its up to the hypervisor to decide on what it wants to
> do.

Drat.  Is there any hope of pushing through a GHCB change to require the guest
to use PSC?

> > It would simplify KVM (albeit not much of a simplification) and would also
> > make debugging easier since transitions would require an explicit guest
> > request and guest bugs would result in errors instead of random
> > corruption/weirdness.
>
> I am good with enforcing this from the KVM. But the question is, what fault
> we should inject in the guest when KVM detects that guest has issued the
> page state change.

Injecting a fault, at least from KVM, isn't an option since there's no
architectural behavior we can leverage.  E.g. a guest that isn't enlightened
enough to properly use PSC isn't going to do anything useful with a #MC or #VC.

Sadly, as is, I think our only options are to either automatically convert RMP
entries as needed, or to punt the exit to userspace.  Maybe we could do both,
e.g. have a module param to control the behavior?  The problem with punting to
userspace is that KVM would also need a way for userspace to fix the issue,
otherwise we're just taking longer to kill the guest :-/
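
To make the "do both" idea concrete, here is a rough userspace model of the
control flow, not a patch.  Every name in it (rmp_auto_convert,
handle_rmp_fault(), the two structs) is made up for illustration and is not an
existing KVM interface, and a real RMP entry carries far more state than the
single assigned/shared bit assumed here.

```c
/* Userspace model only: none of these names exist in KVM. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a module param selecting the policy. */
static bool rmp_auto_convert = true;

enum rmp_fault_action {
	RMP_FAULT_RETRY,	/* RMP entry is usable, resume the guest */
	RMP_FAULT_EXIT_USER,	/* punt the exit to userspace */
};

/* Hypothetical, heavily simplified view of one RMP entry. */
struct rmp_entry_model {
	bool assigned;		/* true: guest-private, false: shared */
};

/* Hypothetical description of a #NPF caused by an RMP violation. */
struct rmp_fault_model {
	bool guest_wants_private;	/* kind of access the guest attempted */
};

static enum rmp_fault_action
handle_rmp_fault(struct rmp_entry_model *e, const struct rmp_fault_model *f)
{
	assert(e != NULL && f != NULL);

	/* State already matches the access, nothing to convert. */
	if (e->assigned == f->guest_wants_private)
		return RMP_FAULT_RETRY;

	/* Policy knob off: leave the entry alone and let userspace decide. */
	if (!rmp_auto_convert)
		return RMP_FAULT_EXIT_USER;

	/* Policy knob on: implicit page state change on the guest's behalf. */
	e->assigned = f->guest_wants_private;
	return RMP_FAULT_RETRY;
}
```

Note the RMP_FAULT_EXIT_USER path only helps if userspace is also given some
way to fix up the entry, which this model does not attempt to cover.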