On Wed, Jul 07, 2021, Brijesh Singh wrote:
> The integrity guarantee of SEV-SNP is enforced through the RMP table.
> The RMP is used in conjunction with standard x86 and IOMMU page
> tables to enforce memory restrictions and page access rights. The
> RMP is indexed by system physical address, and is checked at the end
> of CPU and IOMMU table walks. The RMP check is enforced as soon as
> SEV-SNP is enabled globally in the system. Not every memory access
> requires an RMP check. In particular, the read accesses from the
> hypervisor do not require RMP checks because the data confidentiality
> is already protected via memory encryption. When hardware encounters
> an RMP check failure, it raises a page-fault exception. The RMP bit in
> fault error code can be used to determine if the fault was due to an
> RMP check failure.
>
> A write from the hypervisor goes through the RMP checks. When the
> hypervisor writes to pages, hardware checks to ensure that the assigned
> bit in the RMP is zero (i.e. page is shared). If the page table entry that
> gives the sPA indicates that the target page size is a large page, then
> all RMP entries for the 4KB constituting pages of the target must have the
> assigned bit 0. If one of the entries does not have assigned bit 0 then
> hardware will raise an RMP violation. To resolve it, split the page table
> entry leading to target page into 4K.

Isn't the above just saying:

  All RMP entries covered by a large page must match the shared vs. encrypted
  state of the page, e.g. host large pages must have assigned=0 for all relevant
  RMP entries.

> This poses a challenge in the Linux memory model. The Linux kernel
> creates a direct mapping of all the physical memory -- referred to as
> the physmap. The physmap may contain a valid mapping of guest owned pages.
> During the page table walk, the host access may get into the situation
> where one of the pages within the large page is owned by the guest (i.e.
> assigned bit is set in RMP). A write to a non-guest page within the large
> page will raise an RMP violation. Call set_memory_4k() to split the physmap
> before adding the page in the RMP table. This ensures that the pages
> added in the RMP table are used as 4K in the physmap.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> ---
>  arch/x86/kernel/sev.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 949efe530319..a482e01f880a 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -2375,6 +2375,12 @@ int rmpupdate(struct page *page, struct rmpupdate *val)
>  	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
>  		return -ENXIO;
>
> +	ret = set_memory_4k((unsigned long)page_to_virt(page), 1);

IIUC, this shatters the direct map for any page that's assigned to an SNP
guest, and the large pages are never recovered?

I believe a better approach would be to do something similar to
memfd_secret[*], which encountered a similar problem with the direct map.
Instead of forcing the direct map to be forever 4k, unmap the direct map
when making a page guest private, and restore the direct map when it's made
shared (or freed).

I thought memfd_secret had also solved the problem of restoring large pages
in the direct map, but at a glance I can't tell if that's actually
implemented anywhere. But, even if it's not currently implemented, I think
it makes sense to mimic the memfd_secret approach so that both features can
benefit if large page preservation/restoration is ever added.

[*] https://lkml.kernel.org/r/20210518072034.31572-5-rppt@xxxxxxxxxx

> +	if (ret) {
> +		pr_err("Failed to split physical address 0x%lx (%d)\n", spa, ret);
> +		return ret;
> +	}
> +
>  	/* Retry if another processor is modifying the RMP entry. */
>  	do {
>  		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
> --
> 2.17.1
>
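
For illustration only, the large-page rule from the changelog can be modeled
in a few lines of user-space C. This is a toy sketch, not kernel code and not
how hardware implements the check: struct toy_rmp and both helper names are
made up here, and the model assumes a single 2M region covered by 512 4K RMP
entries with nothing but an "assigned" flag per entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES_PER_2M 512	/* 4K pages covered by one 2M mapping */

/* Hypothetical toy RMP: one "assigned" flag per 4K page of a 2M region. */
struct toy_rmp {
	bool assigned[TOY_PAGES_PER_2M];
};

/*
 * Host write through a 2M mapping: the write is allowed only if every
 * 4K entry covered by the large page is shared (assigned == 0).
 */
static bool host_write_ok_2m(const struct toy_rmp *rmp)
{
	for (size_t i = 0; i < TOY_PAGES_PER_2M; i++) {
		if (rmp->assigned[i])
			return false;	/* would be an RMP violation */
	}
	return true;
}

/*
 * Host write through a 4K mapping: only the entry for the target page
 * is consulted, so a guest-owned neighbor does not matter.
 */
static bool host_write_ok_4k(const struct toy_rmp *rmp, size_t idx)
{
	assert(idx < TOY_PAGES_PER_2M);
	return !rmp->assigned[idx];
}
```

With one entry marked assigned, host_write_ok_2m() fails for the whole region
while host_write_ok_4k() still succeeds for every other page. That is the
reason the patch splits the mapping, and also why the split need not be
permanent if the guest-owned page is simply absent from the direct map.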