On 1/8/2024 11:13 AM, Michael Kelley wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> Sent: Monday, January 8, 2024 10:37 AM
>>
>> On 1/5/2024 10:30 AM, mhkelley58@xxxxxxxxx wrote:
>>> From: Michael Kelley <mhklinux@xxxxxxxxxxx>
>>>
>>> In a CoCo VM, when transitioning memory from encrypted to decrypted, or
>>> vice versa, the caller of set_memory_encrypted() or set_memory_decrypted()
>>> is responsible for ensuring the memory isn't in use and isn't referenced
>>> while the transition is in progress. The transition has multiple steps,
>>> and the memory is in an inconsistent state until all steps are complete.
>>> A reference while the state is inconsistent could result in an exception
>>> that can't be cleanly fixed up.
>>>
>>> However, the kernel load_unaligned_zeropad() mechanism could cause a stray
>>> reference that can't be prevented by the caller of set_memory_encrypted()
>>> or set_memory_decrypted(), so there's specific code to handle this case.
>>> But a CoCo VM running on Hyper-V may be configured to run with a paravisor,
>>> with the #VC or #VE exception routed to the paravisor. There's no
>>> architectural way to forward the exceptions back to the guest kernel, and
>>> in such a case, the load_unaligned_zeropad() specific code doesn't work.
>>>
>>> To avoid this problem, mark pages as "not present" while a transition
>>> is in progress. If load_unaligned_zeropad() causes a stray reference, a
>>> normal page fault is generated instead of #VC or #VE, and the
>>> page-fault-based fixup handlers for load_unaligned_zeropad() resolve the
>>> reference. When the encrypted/decrypted transition is complete, mark the
>>> pages as "present" again.
>>
>> Change looks good to me. But I am wondering why you are adding it as part
>> of the prepare and finish callbacks instead of directly in the
>> set_memory_encrypted() function.
>>
>
> The prepare/finish callbacks are different for TDX, SEV-SNP, and
> Hyper-V CoCo guests running with a paravisor -- so there are three sets
> of callbacks. As described in the cover letter, I've given up on using this
> scheme for the TDX and SEV-SNP cases, because of the difficulty with
> the SEV-SNP callbacks needing a valid virtual address (whereas TDX and
> Hyper-V paravisor need only a physical address). So it seems like the
> callbacks specific to the Hyper-V paravisor are the natural place for the
> code. That leaves the TDX and SEV-SNP code paths unchanged, which
> was my intent.
>

Got it. Thanks for clarifying it.

> Or maybe I'm not understanding your comment? If that's the case,
> please elaborate.
>
> Michael
>
>> Reviewed-by: Kuppuswamy Sathyanarayanan
>> <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
>>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
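
For readers following the thread, the ordering described in the quoted patch description can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the patch itself: every name below (struct model_page, model_prepare(), model_finish(), stray_read(), transition_with_stray_read()) is hypothetical, and no kernel page-table or hypervisor call is made. It models only the claim that a stray reference to a not-present page yields an ordinary page fault, while a stray reference to a present page in mid-transition yields #VC/#VE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one page; these are not kernel types. */
enum enc_state { ENC_ENCRYPTED, ENC_DECRYPTED, ENC_IN_TRANSITION };

enum access_result {
	ACCESS_OK,              /* page is present and in a consistent state */
	ACCESS_PAGE_FAULT,      /* ordinary page fault; the zeropad fixup can resolve it */
	ACCESS_VC_VE_EXCEPTION  /* #VC/#VE; with a paravisor the guest cannot fix it up */
};

struct model_page {
	bool present;
	enum enc_state state;
};

/* Models a stray read such as one caused by load_unaligned_zeropad(). */
static enum access_result stray_read(const struct model_page *pg)
{
	if (!pg->present)
		return ACCESS_PAGE_FAULT;
	if (pg->state == ENC_IN_TRANSITION)
		return ACCESS_VC_VE_EXCEPTION;
	return ACCESS_OK;
}

/* "Prepare" step: optionally mark the page not present before the transition. */
static void model_prepare(struct model_page *pg, bool mark_not_present)
{
	if (mark_not_present)
		pg->present = false;
	pg->state = ENC_IN_TRANSITION;
}

/* "Finish" step: the transition is complete, so mark the page present again. */
static void model_finish(struct model_page *pg, enum enc_state target)
{
	assert(target != ENC_IN_TRANSITION);
	pg->state = target;
	pg->present = true;
}

/* Runs one transition and reports what a stray read in the middle would see. */
static enum access_result transition_with_stray_read(struct model_page *pg,
						     enum enc_state target,
						     bool mark_not_present)
{
	enum access_result seen;

	model_prepare(pg, mark_not_present);
	seen = stray_read(pg);
	model_finish(pg, target);
	return seen;
}
```

In this model, calling transition_with_stray_read() with mark_not_present set to true reports ACCESS_PAGE_FAULT, and with it set to false reports ACCESS_VC_VE_EXCEPTION, which is the distinction the patch description relies on.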