Re: [PATCH Part2 RFC v4 10/40] x86/fault: Add support to handle the RMP fault for user address

Brijesh Singh <brijesh.singh@xxxxxxx> · Mon, 12 Jul 2021 10:43:50 -0500




Hi Dave,


On 7/8/21 11:16 AM, Dave Hansen wrote:

"SIGBUG"?

Its typo, it should be SIGBUS

+
+	if (unlikely(!cpu_feature_enabled(X86_FEATURE_SEV_SNP)))
+		return RMP_FAULT_KILL;

Shouldn't this be a WARN_ON_ONCE()?  How can we get RMP faults without
SEV-SNP?

Yes, we should *not* get RMP fault if SEV-SNP is not enabled. I can use 
the WARN_ON_ONCE().



+	/* Get the native page level */
+	pte = lookup_address_in_mm(current->mm, address, &level);
+	if (unlikely(!pte))
+		return RMP_FAULT_KILL;

What would this mean?  There was an RMP fault on a non-present page?
How could that happen?  What if there was a race between an unmapping
event and the RMP fault delivery?

We should not have RMP fault for non-present pages. But you have a good 
point that there maybe a race between the unmap event and RMP fault. 
Instead of terminating the process we should simply retry.



+	pfn = pte_pfn(*pte);
+	if (level > PG_LEVEL_4K) {
+		mask = pages_per_hpage(level) - pages_per_hpage(level - 1);
+		pfn |= (address >> PAGE_SHIFT) & mask;
+	}

This looks inherently racy.  What happens if there are two parallel RMP
faults on the same 2M page.  One of them splits the page tables, the
other gets a fault for an already-split page table.
 > Is that handled here somehow?

Yes, in this particular case we simply retry and hardware should 
re-evaluate the page level and take the corrective action.



+	/* Get the page level from the RMP entry. */
+	e = snp_lookup_page_in_rmptable(pfn_to_page(pfn), &rmp_level);
+	if (!e)
+		return RMP_FAULT_KILL;

The snp_lookup_page_in_rmptable() failure cases looks WARN-worthly.
Either you're doing a lookup for something not *IN* the RMP table, or
you don't support SEV-SNP, in which case you shouldn't be in this code
in the first place.

Noted.


+	/*
+	 * Check if the RMP violation is due to the guest private page access.
+	 * We can not resolve this RMP fault, ask to kill the guest.
+	 */
+	if (rmpentry_assigned(e))
+		return RMP_FAULT_KILL;

No "We's", please.  Speak in imperative voice.

Noted.


+	/*
+	 * The backing page level is higher than the RMP page level, request
+	 * to split the page.
+	 */
+	if (level > rmp_level)
+		return RMP_FAULT_PAGE_SPLIT;

This can theoretically trigger on a hugetlbfs page.  Right?


Yes, theoretically.

In the current implementation, the VMM is enlightened to not use the 
hugetlbfs for backing page when creating the SEV-SNP guests.


I thought I asked about this before... more below...

+	return RMP_FAULT_RETRY;
+}
+
  /*
   * Handle faults in the user portion of the address space.  Nothing in here
   * should check X86_PF_USER without a specific justification: for almost
@@ -1298,6 +1350,7 @@ void do_user_addr_fault(struct pt_regs *regs,
  	struct task_struct *tsk;
  	struct mm_struct *mm;
  	vm_fault_t fault;
+	int ret;
  	unsigned int flags = FAULT_FLAG_DEFAULT;
  
  	tsk = current;
@@ -1378,6 +1431,22 @@ void
(struct pt_regs *regs,
  	if (error_code & X86_PF_INSTR)
  		flags |= FAULT_FLAG_INSTRUCTION;
  
+	/*
+	 * If its an RMP violation, try resolving it.
+	 */
+	if (error_code & X86_PF_RMP) {
+		ret = handle_user_rmp_page_fault(error_code, address);
+		if (ret == RMP_FAULT_PAGE_SPLIT) {
+			flags |= FAULT_FLAG_PAGE_SPLIT;
+		} else if (ret == RMP_FAULT_KILL) {
+			fault |= VM_FAULT_SIGBUS;
+			do_sigbus(regs, error_code, address, fault);
+			return;
+		} else {
+			return;
+		}
+	}

Why not just have handle_user_rmp_page_fault() return a VM_FAULT_* code
directly?


I don't have any strong reason against it. In next rev, I can update to 
use the VM_FAULT_* code and call the do_sigbus() etc.

I also suspect you can just set VM_FAULT_SIGBUS and let the do_sigbus()
call later on in the function do its work.
  
+static int handle_split_page_fault(struct vm_fault *vmf)
+{
+	if (!IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+		return VM_FAULT_SIGBUS;
+
+	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
+	return 0;
+}

What will this do when you hand it a hugetlbfs page?


VMM is updated to not use the hugetlbfs when creating SEV-SNP guests. 
So, we should not run into it.

-Brijesh