On 1/29/24 12:59, Borislav Petkov wrote:
> On Sat, Jan 27, 2024 at 05:02:49PM +0100, Borislav Petkov wrote:
>> This function takes any PFN it gets passed in as it is. I don't care
>> who its users are now or in the future and whether they pay attention
>> what they pass into - it needs to be properly defined.
>
> Ok, we solved it offlist, here's the final version I have. It has
> a comment explaining what I was asking.
>
> ---
> From: Michael Roth <michael.roth@xxxxxxx>
> Date: Thu, 25 Jan 2024 22:11:11 -0600
> Subject: [PATCH] x86/sev: Adjust the directmap to avoid inadvertent RMP faults
>
> If the kernel uses a 2MB or larger directmap mapping to write to an
> address, and that mapping contains any 4KB pages that are set to private
> in the RMP table, an RMP #PF will trigger and cause a host crash.
>
> SNP-aware code that owns the private PFNs will never attempt such
> a write, but other kernel tasks writing to other PFNs in the range may
> trigger these checks inadvertently due to writing to those other PFNs
> via a large directmap mapping that happens to also map a private PFN.
>
> Prevent this by splitting any 2MB+ mappings that might end up containing
> a mix of private/shared PFNs as a result of a subsequent RMPUPDATE for
> the PFN/rmp_level passed in.
>
> Another way to handle this would be to limit the directmap to 4K
> mappings in the case of hosts that support SNP, but there is potential
> risk for performance regressions of certain host workloads.
>
> Handling it as-needed results in the directmap being slowly split over
> time, which lessens the risk of a performance regression since the more
> the directmap gets split as a result of running SNP guests, the more
> likely the host is being used primarily to run SNP guests, where
> a mostly-split directmap is actually beneficial since there is less
> chance of TLB flushing and cpa_lock contention being needed to perform
> these splits.
>
> Cases where a host knows in advance it wants to primarily run SNP guests
> and wishes to pre-split the directmap can be handled by adding
> a tuneable in the future, but preliminary testing has shown this to not
> provide a significant benefit in the common case of guests that are
> backed primarily by 2MB THPs, so it does not seem to be warranted
> currently and can be added later if a need arises in the future.
>
> Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
> Signed-off-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
> Link: https://lore.kernel.org/r/20240126041126.1927228-12-michael.roth@xxxxxxx

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

Thanks!
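
For anyone following the thread, the split decision described in the commit message can be sketched in plain userspace C. This is an illustration only, not the code from the patch: needs_split(), pfn_2m_base() and the LEVEL_* constants are hypothetical names, and the rule that a full 2MB RMP transition needs no split is an assumption of this sketch rather than something the commit message spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for page-level constants; not the kernel's values. */
enum map_level { LEVEL_4K = 1, LEVEL_2M = 2, LEVEL_1G = 3 };

/* Number of 4KB PFNs covered by one 2MB region. */
#define PFNS_PER_2M 512ULL

/* First PFN of the 2MB-aligned region that contains pfn. */
uint64_t pfn_2m_base(uint64_t pfn)
{
    return pfn & ~(PFNS_PER_2M - 1);
}

/*
 * Decide whether the directmap mapping that covers a PFN should be split
 * to 4K before the PFN's RMP entry changes state.
 *
 * directmap_level: size of the mapping currently covering the PFN.
 * rmp_level:       size of the RMP entry about to be updated.
 */
bool needs_split(enum map_level directmap_level, enum map_level rmp_level)
{
    /* This sketch only models 4KB and 2MB RMP entries. */
    assert(rmp_level == LEVEL_4K || rmp_level == LEVEL_2M);

    /*
     * Assumption: when a whole 2MB range changes state together, no 2MB
     * region ends up with a private/shared mix, so nothing is split.
     */
    if (rmp_level == LEVEL_2M)
        return false;

    /*
     * A single 4KB page changes state. A 2MB or larger mapping over it
     * would then mix private and shared PFNs, so it has to be split.
     * A mapping that is already 4K needs no work.
     */
    return directmap_level > LEVEL_4K;
}
```

Under these assumptions, needs_split(LEVEL_2M, LEVEL_4K) and needs_split(LEVEL_1G, LEVEL_4K) both return true, while an already-4K mapping or a full 2MB transition returns false. pfn_2m_base() shows which 2MB-aligned range such a split would cover.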