From: Tom Lendacky <thomas.lendacky@xxxxxxx> Sent: Thursday, June 1, 2023 11:19 AM > > On 5/31/23 15:00, Michael Kelley (LINUX) wrote: > > From: Sathyanarayanan Kuppuswamy > <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> > > Sent: Tuesday, May 30, 2023 6:22 AM > >> > >> Hi, > >> > >> On 5/30/23 5:57 AM, Tom Lendacky wrote: > >>> On 5/29/23 19:57, Kirill A. Shutemov wrote: > >>>> On Fri, May 26, 2023 at 03:10:56PM -0700, Sathyanarayanan Kuppuswamy > wrote: > >>>>> > >>>>> > >>>>> On 5/26/23 5:02 AM, Kirill A. Shutemov wrote: > >>>>>> Touching privately mapped GPA that is not properly converted to private > >>>>>> with MapGPA and accepted leads to unrecoverable exit to VMM. > >>>>>> > >>>>>> load_unaligned_zeropad() can touch memory that is not owned by the > >>>>>> caller, but just happened to next after the owned memory. > >>>>> > >>>>> /s/to/to be ? > >>>> > >>>> Yep, my bad. > >>>> > >>>>>> This load_unaligned_zeropad() behaviour makes it important when kernel > >>>>>> asks VMM to convert a GPA from shared to private or back. Kernel must > >>>>>> never have a page mapped into direct mapping (and aliases) as private > >>>>>> when the GPA is already converted to shared or when GPA is not yet > >>>>>> converted to private. > >>>>> > >>>>> I am wondering whether this issue exist in the AMD code? > >>>>> > >>>>> IMO, you can add some info on the window in set_memory_encrypted() > >>>>> where this race exists. > >>>> > >>>> I don't think AMD affected by load_unaligned_zeropad() the same way as > >>>> Intel does. But I'm not sure. > >>>> > >>>> Tom, do you have any comments? > >>> > >>> Right, shouldn't be an issue for SNP. > >> > >> Thanks for confirming. > >> > > > > Tom -- For my education, could you elaborate on why this problem can't > > occur in an SEV-SNP guest? There's still a window where the direct map > > PTE and the RMP as maintained by the hypervisor are out-of-sync. If > > load_unaligned_zeropad() does a read using the direct map PTE during > > this out-of-sync window, isn't that going to trap to the hypervisor? How > > is the scenario is handled from there to provide the zeros to > > load_unaligned_zeropad()? I need to make sure Hyper-V is doing whatever > > is needed. :-) > > Ah, I think I misunderstood this when it was being talked about. The issue > SNP would have would be between setting the c-bit but before the PVALIDATE > is issued. Prior to the RMP being updated, referencing the page will > generate an #NPF and automatically change the RMP over to private (in > KVM). However, after the guest is resumed, the page will not have been > validated resulting in a #VC with error code 0x404 being generated, > causing the guest to terminate itself. > > I suppose, when a 0x404 error code is encountered by the #VC handler, it > could call search_exception_tables() and call ex_handler_zeropad() for the > EX_TYPE_ZEROPAD type (ex_handler_zeropad is currently static, though). > Tom -- Does the above sequence *depend* on the hypervisor doing anything to make it work? I'm not clear on why KVM would automatically change the page over to private. If there's a dependency on the hypervisor doing something, then it seems like we'll need to standardize that "something" across hypervisors, lest we end up with per-hypervisor code in Linux to handle this scenario. And running SEV-SNP with multiple VMPLs probably makes it even more complicated. Kirill -- Same question about TDX. Does making load_unaligned_zeropad() work in a TDX VM depend on the hypervisor doing anything? Or is the behavior seen by the guest dependent only on architected behavior of the TDX processor? Looking at this problem from a slightly higher level, and thinking out loud a bit, load_unaligned_zeropad() functionality is provided only for certain architectures: x86/64, arm, arm64, and PowerPC 64 (little endian). There are fallbacks for architectures that don't support it. With two minor tweaks to Kconfig files, I've built x86 with load_unaligned_zeropad() disabled. Maybe with today's processors the performance benefits are past their prime, and running with it disabled in CoCo VMs is the better solution. Does anyone have a sense of whether the perf impact would be measureable? If doing the load_unaligned_zeropad() enable/disable at build time is too limiting, maybe it could be runtime based on whether page private/shared state is being enforced. I haven't looked at the details. Thoughts? Michael