On Fri, May 10, 2024 at 06:01:52PM +0200, Paolo Bonzini wrote: > On 5/10/24 15:58, Sean Christopherson wrote: > > On Thu, May 09, 2024, Michael Roth wrote: > > > The intended logic when handling #NPFs with the RMP bit set (31) is to > > > first check to see if the #NPF requires a shared<->private transition > > > and, if so, to go ahead and let the corresponding KVM_EXIT_MEMORY_FAULT > > > get forwarded on to userspace before proceeding with any handling of > > > other potential RMP fault conditions like needing to PSMASH the RMP > > > entry/etc (which will be done later if the guest still re-faults after > > > the KVM_EXIT_MEMORY_FAULT is processed by userspace). > > > > > > The determination of whether any userspace handling of > > > KVM_EXIT_MEMORY_FAULT is needed is done by interpreting the return code > > > of kvm_mmu_page_fault(). However, the current code misinterprets the > > > return code, expecting 0 to indicate a userspace exit rather than less > > > than 0 (-EFAULT). This leads to the following unexpected behavior: > > > > > > - for KVM_EXIT_MEMORY_FAULTs resulting for implicit shared->private > > > conversions, warnings get printed from sev_handle_rmp_fault() > > > because it does not expect to be called for GPAs where > > > KVM_MEMORY_ATTRIBUTE_PRIVATE is not set. Standard linux guests don't > > > generally do this, but it is allowed and should be handled > > > similarly to private->shared conversions rather than triggering any > > > sort of warnings > > > > > > - if gmem support for 2MB folios is enabled (via currently out-of-tree > > > code), implicit shared<->private conversions will always result in > > > a PSMASH being attempted, even if it's not actually needed to > > > resolve the RMP fault. This doesn't cause any harm, but results in a > > > needless PSMASH and zapping of the sPTE > > > > > > Resolve these issues by calling sev_handle_rmp_fault() only when > > > kvm_mmu_page_fault()'s return code is greater than or equal to 0, > > > indicating a KVM_MEMORY_EXIT_FAULT/-EFAULT isn't needed. While here, > > > simplify the code slightly and fix up the associated comments for better > > > clarity. > > > > > > Fixes: ccc9d836c5c3 ("KVM: SEV: Add support to handle RMP nested page faults") > > > > > > Signed-off-by: Michael Roth <michael.roth@xxxxxxx> > > > --- > > > arch/x86/kvm/svm/svm.c | 10 ++++------ > > > 1 file changed, 4 insertions(+), 6 deletions(-) > > > > > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > > > index 426ad49325d7..9431ce74c7d4 100644 > > > --- a/arch/x86/kvm/svm/svm.c > > > +++ b/arch/x86/kvm/svm/svm.c > > > @@ -2070,14 +2070,12 @@ static int npf_interception(struct kvm_vcpu *vcpu) > > > svm->vmcb->control.insn_len); > > > /* > > > - * rc == 0 indicates a userspace exit is needed to handle page > > > - * transitions, so do that first before updating the RMP table. > > > + * rc < 0 indicates a userspace exit may be needed to handle page > > > + * attribute updates, so deal with that first before handling other > > > + * potential RMP fault conditions. > > > */ > > > - if (error_code & PFERR_GUEST_RMP_MASK) { > > > - if (rc == 0) > > > - return rc; > > > + if (rc >= 0 && error_code & PFERR_GUEST_RMP_MASK) > > > > This isn't correct either. A return of '0' also indiciates "exit to userspace", > > it just doesn't happen with SNP because '0' is returned only when KVM attempts > > emulation, and that too gets short-circuited by svm_check_emulate_instruction(). > > > > And I would honestly drop the comment, KVM's less-than-pleasant 1/0/-errno return > > values overload is ubiquitous enough that it should be relatively self-explanatory. > > > > Or if you prefer to keep a comment, drop the part that specifically calls out > > attributes updates, because that incorrectly implies that's the _only_ reason > > why KVM checks the return. But my vote is to drop the comment, because it > > essentially becomes "don't proceed to step 2 if step 1 failed", which kind of > > makes the reader go "well, yeah". > > So IIUC you're suggesting > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 426ad49325d7..c39eaeb21981 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -2068,16 +2068,11 @@ static int npf_interception(struct kvm_vcpu *vcpu) > static_cpu_has(X86_FEATURE_DECODEASSISTS) ? > svm->vmcb->control.insn_bytes : NULL, > svm->vmcb->control.insn_len); > + if (rc <= 0) > + return rc; > - /* > - * rc == 0 indicates a userspace exit is needed to handle page > - * transitions, so do that first before updating the RMP table. > - */ > - if (error_code & PFERR_GUEST_RMP_MASK) { > - if (rc == 0) > - return rc; > + if (error_code & PFERR_GUEST_RMP_MASK) > sev_handle_rmp_fault(vcpu, fault_address, error_code); > - } > return rc; > } > > ? > > So, we're... a bit tight for 6.10 to include SNP and that is an > understatement. My plan is to merge it for 6.11, but do so > immediately after the merge window ends. In other words, it > is a delay in terms of release but not in terms of time. I > don't want QEMU and kvm-unit-tests work to be delayed any > further, in particular. That's unfortunate, I'd thought from the PUCK call that we still had some time to stabilize things before merge window. But whatever you think is best. > > Once we sort out the loose ends of patches 21-23, you could send > it as a pull request. Ok, as a pull request against kvm/next, or kvm/queue? Thanks, Mike > > Paolo > >