Re: [PATCH Part2 v6 41/49] KVM: SVM: Add support to handle the RMP nested page fault

"Kalra, Ashish" <ashish.kalra@xxxxxxx> · Thu, 13 Oct 2022 10:00:49 -0500

On 10/12/2022 5:53 PM, Alper Gun wrote:
On Mon, Oct 10, 2022 at 7:32 PM Kalra, Ashish <ashish.kalra@xxxxxxx> wrote:

Hello Alper,

On 10/10/2022 5:03 PM, Alper Gun wrote:
On Mon, Jun 20, 2022 at 4:13 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:

From: Brijesh Singh <brijesh.singh@xxxxxxx>

When SEV-SNP is enabled in the guest, the hardware places restrictions on
all memory accesses based on the contents of the RMP table. When the
hardware encounters an RMP check failure caused by a guest memory access,
it raises an #NPF. The error code contains additional information on the
access type. See APM volume 2 for additional information.

Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
---
   arch/x86/kvm/svm/sev.c | 76 ++++++++++++++++++++++++++++++++++++++++++
   arch/x86/kvm/svm/svm.c | 14 +++++---
   2 files changed, 86 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4ed90331bca0..7fc0fad87054 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4009,3 +4009,79 @@ void sev_post_unmap_gfn(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn)

          spin_unlock(&sev->psc_lock);
   }
+
+void handle_rmp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
+{
+       int rmp_level, npt_level, rc, assigned;
+       struct kvm *kvm = vcpu->kvm;
+       gfn_t gfn = gpa_to_gfn(gpa);
+       bool need_psc = false;
+       enum psc_op psc_op;
+       kvm_pfn_t pfn;
+       bool private;
+
+       write_lock(&kvm->mmu_lock);
+
+       if (unlikely(!kvm_mmu_get_tdp_walk(vcpu, gpa, &pfn, &npt_level)))
+               goto unlock;
+
+       assigned = snp_lookup_rmpentry(pfn, &rmp_level);
+       if (unlikely(assigned < 0))
+               goto unlock;
+
+       private = !!(error_code & PFERR_GUEST_ENC_MASK);
+
+       /*
+        * If the fault was due to size mismatch, or NPT and RMP page levels
+        * are not in sync, then use PSMASH to split the RMP entry into 4K.
+        */
+       if ((error_code & PFERR_GUEST_SIZEM_MASK) ||
+           (npt_level == PG_LEVEL_4K && rmp_level == PG_LEVEL_2M && private)) {
+               rc = snp_rmptable_psmash(kvm, pfn);


Regarding this case:
RMP level is 4K
Page table level is 2M

Does this also cause a page fault with size mismatch? If so, we
shouldn't try psmash because the rmp entry is already 4K.

I see these errors in our tests and I think it may be happening
because rmp size is already 4K.

[ 1848.752952] psmash failed, gpa 0x191560000 pfn 0x536cd60 rc 7
[ 2922.879635] psmash failed, gpa 0x102830000 pfn 0x37c8230 rc 7
[ 3010.983090] psmash failed, gpa 0x104220000 pfn 0x6cf1e20 rc 7
[ 3170.792050] psmash failed, gpa 0x108a80000 pfn 0x20e0080 rc 7
[ 3345.955147] psmash failed, gpa 0x11b480000 pfn 0x1545e480 rc 7

Shouldn't we use AND instead of OR in the if statement?


I don't believe we can do that, looking at the typical use case below:

[   37.243969] #VMEXIT (NPF) - SIZEM, err 0xc80000005 npt_level 2,
rmp_level 2, private 1
[   37.243973] trying psmash gpa 0x7f790000 pfn 0x1f5d90

This is typically the case for a #VMEXIT(NPF) with the SIZEM error code:
the guest tries to do PVALIDATE on 4K GHCB pages while both the RMP table
and the NPT are optimally set up with 2M hugepages, as can be seen above.
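
To make the error code in the log above concrete, here is a minimal
sketch that decodes 0xc80000005 into the fault bits discussed in this
thread. The bit positions are an assumption based on the PFERR_*
definitions used by this series (RMP at bit 31, ENC at bit 34, SIZEM at
bit 35); struct npf_flags and decode_npf_error() are hypothetical names
for illustration only, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, mirroring the PFERR_* masks in the SNP series. */
#define PFERR_PRESENT_MASK     (1ULL << 0)
#define PFERR_WRITE_MASK       (1ULL << 1)
#define PFERR_USER_MASK        (1ULL << 2)
#define PFERR_GUEST_RMP_MASK   (1ULL << 31)
#define PFERR_GUEST_ENC_MASK   (1ULL << 34)
#define PFERR_GUEST_SIZEM_MASK (1ULL << 35)

/* Hypothetical helper type: one flag per error-code bit of interest. */
struct npf_flags {
	bool present;
	bool write;
	bool user;
	bool rmp;    /* fault caused by an RMP check */
	bool enc;    /* access was to a private (encrypted) page */
	bool sizem;  /* page size mismatch (PVALIDATE/RMPADJUST) */
};

/* Split an #NPF error code into the individual flags. */
static inline struct npf_flags decode_npf_error(uint64_t error_code)
{
	struct npf_flags f = {
		.present = (error_code & PFERR_PRESENT_MASK) != 0,
		.write   = (error_code & PFERR_WRITE_MASK) != 0,
		.user    = (error_code & PFERR_USER_MASK) != 0,
		.rmp     = (error_code & PFERR_GUEST_RMP_MASK) != 0,
		.enc     = (error_code & PFERR_GUEST_ENC_MASK) != 0,
		.sizem   = (error_code & PFERR_GUEST_SIZEM_MASK) != 0,
	};

	return f;
}
```

Under these assumptions, decode_npf_error(0xc80000005ULL) reports the
RMP, ENC and SIZEM bits as set, which matches "SIZEM" and "private 1" in
the log line.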

Is it possible to investigate in more depth when this case is being
observed:

Yes, I added more logs and I can see that these errors happen when RMP
level is 4K and NPT level is 2M.
psmash fails as expected. I think it is just a log message and there is no
real issue, but it would be best not to try psmash when the RMP level is 4K.
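
The three candidate conditions in this discussion can be compared with a
small standalone sketch. It is not a proposed patch. psmash_patch()
copies the condition from the posted hunk, psmash_and() is the AND
variant asked about earlier, and psmash_guarded() is a hypothetical
variant that additionally requires the RMP entry to still be 2M. The
PG_LEVEL_* values and PFERR_* bit positions are assumptions chosen to
match the log output in this thread (level 2 == 2M).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, chosen to match the kernel's page-level numbering. */
#define PG_LEVEL_4K 1
#define PG_LEVEL_2M 2

/* Assumed bit positions for the guest #NPF error-code bits. */
#define PFERR_GUEST_ENC_MASK   (1ULL << 34)
#define PFERR_GUEST_SIZEM_MASK (1ULL << 35)

/* Condition as written in the posted patch (OR). */
static inline bool psmash_patch(uint64_t ec, int npt_level, int rmp_level)
{
	bool private = (ec & PFERR_GUEST_ENC_MASK) != 0;

	return (ec & PFERR_GUEST_SIZEM_MASK) ||
	       (npt_level == PG_LEVEL_4K && rmp_level == PG_LEVEL_2M && private);
}

/* The AND variant suggested earlier in the thread. */
static inline bool psmash_and(uint64_t ec, int npt_level, int rmp_level)
{
	bool private = (ec & PFERR_GUEST_ENC_MASK) != 0;

	return (ec & PFERR_GUEST_SIZEM_MASK) &&
	       (npt_level == PG_LEVEL_4K && rmp_level == PG_LEVEL_2M && private);
}

/* Hypothetical variant: keep the OR, but never PSMASH a 4K RMP entry. */
static inline bool psmash_guarded(uint64_t ec, int npt_level, int rmp_level)
{
	return rmp_level == PG_LEVEL_2M &&
	       psmash_patch(ec, npt_level, rmp_level);
}
```

For the GHCB case quoted earlier (err 0xc80000005, npt_level 2,
rmp_level 2), psmash_patch() and psmash_guarded() both request a PSMASH,
while psmash_and() does not. For the failing case (RMP level 4K, NPT
level 2M, and assuming SIZEM was set in the error code), only
psmash_guarded() skips the PSMASH.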


Now, the SIZEM bit is only set when PVALIDATE or RMPADJUST fails because
the guest attempts to validate a 4K page that is backed by a 2MB RMP
entry, which is not the case here as the RMP level is 4K.

Also, this does not fall into the second case for the same reason.

An #NPF will happen during the guest page table walk if RMP checks fail
for a 2M nested page and RMP.SubPage_Count != 0 OR
RMP.PageSize != nested table page size, but then that shouldn't have
the SIZEM fault bit set.

This raises a concern about an existing race condition. It could
possibly race with
snp_handle_page_state_change()->snp_make_page_shared()->snp_rmptable_psmash(),
but that code path seems to be protected from this nested RMP #PF
handler, as they both acquire kvm->mmu_lock.

So, this still needs more investigation.

Can you share what kind of tests you are running to reproduce this
issue?

Thanks,
Ashish

RMP level is 4K
Page table level is 2M
We shouldn't try psmash because the rmp entry is already 4K.

if ((error_code & PFERR_GUEST_SIZEM_MASK) && ...