On Mon, Nov 14, 2022 at 03:02:37PM +0100, Vlastimil Babka wrote:
> On 11/1/22 16:19, Michael Roth wrote:
> > On Tue, Nov 01, 2022 at 07:37:29PM +0800, Chao Peng wrote:
> >> >
> >> > 1) restoring kernel directmap:
> >> >
> >> > Currently SNP (and I believe TDX) need to either split or remove kernel
> >> > direct mappings for restricted PFNs, since there is no guarantee that
> >> > other PFNs within a 2MB range won't be used for non-restricted
> >> > (which will cause an RMP #PF in the case of SNP since the 2MB
> >> > mapping overlaps with guest-owned pages)
> >>
> >> Has the splitting and restoring been a well-discussed direction? I'm
> >> just curious whether there is other options to solve this issue.
> >
> > For SNP it's been discussed for quite some time, and either splitting or
> > removing private entries from directmap are the well-discussed way I'm
> > aware of to avoid RMP violations due to some other kernel process using
> > a 2MB mapping to access shared memory if there are private pages that
> > happen to be within that range.
> >
> > In both cases the issue of how to restore directmap as 2M becomes a
> > problem.
> >
> > I was also under the impression TDX had similar requirements. If so,
> > do you know what the plan is for handling this for TDX?
> >
> > There are also 2 potential alternatives I'm aware of, but these haven't
> > been discussed in much detail AFAIK:
> >
> > a) Ensure confidential guests are backed by 2MB pages. shmem has a way to
> >    request 2MB THP pages, but I'm not sure how reliably we can guarantee
> >    that enough THPs are available, so if we went that route we'd probably
> >    be better off requiring the use of hugetlbfs as the backing store. But
> >    obviously that's a bit limiting and it would be nice to have the option
> >    of using normal pages as well.
> >    One nice thing with invalidation
> >    scheme proposed here is that this would "Just Work" if implement
> >    hugetlbfs support, so an admin that doesn't want any directmap
> >    splitting has this option available, otherwise it's done as a
> >    best-effort.
> >
> > b) Implement general support for restoring directmap as 2M even when
> >    subpages might be in use by other kernel threads. This would be the
> >    most flexible approach since it requires no special handling during
> >    invalidations, but I think it's only possible if all the CPA
> >    attributes for the 2M range are the same at the time the mapping is
> >    restored/unsplit, so some potential locking issues there and still
> >    chance for splitting directmap over time.
>
> I've been hoping that
>
> c) using a mechanism such as [1] [2] where the goal is to group together
> these small allocations that need to increase directmap granularity so
> maximum number of large mappings are preserved.

As I mentioned in the other thread, the restricted memfd can be backed by
secretmem instead of plain memfd. It already handles directmap with care.

But I don't think it has to be part of the initial restricted memfd
implementation. It is a SEV-specific requirement, and AMD folks can extend
the implementation as needed later.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov