Elliot Berman <quic_eberman@xxxxxxxxxxx> writes: > On Tue, Sep 10, 2024 at 11:44:01PM +0000, Ackerley Tng wrote: >> Since guest_memfd now supports mmap(), folios have to be prepared >> before they are faulted into userspace. >> >> When memory attributes are switched between shared and private, the >> up-to-date flags will be cleared. >> >> Use the folio's up-to-date flag to indicate being ready for the guest >> usage and can be used to mark whether the folio is ready for shared OR >> private use. > > Clearing the up-to-date flag also means that the page gets zero'd out > whenever it transitions between shared and private (either direction). > pKVM (Android) hypervisor policy can allow in-place conversion between > shared/private. > > I believe the important thing is that sev_gmem_prepare() needs to be > called prior to giving page to guest. In my series, I had made a > ->prepare_inaccessible() callback where KVM would only do this part. > When transitioning to inaccessible, only that callback would be made, > besides the bookkeeping. The folio zeroing happens once when allocating > the folio if the folio is initially accessible (faultable). > > From x86 CoCo perspective, I think it also makes sense to not zero > the folio when changing faultiblity from private to shared: > - If guest is sharing some data with host, you've wiped the data and > guest has to copy again. > - Or, if SEV/TDX enforces that page is zero'd between transitions, > Linux has duplicated the work that trusted entity has already done. > > Fuad and I can help add some details for the conversion. Hopefully we > can figure out some of the plan at plumbers this week. Zeroing the page prevents leaking host data (see function docstring for kvm_gmem_prepare_folio() introduced in [1]), so we definitely don't want to introduce a kernel data leak bug here. In-place conversion does require preservation of data, so for conversions, shall we zero depending on VM type? + Gunyah: don't zero since ->prepare_inaccessible() is a no-op + pKVM: don't zero + TDX: don't zero + SEV: AMD Architecture Programmers Manual 7.10.6 says there is no automatic encryption and implies no zeroing, hence perform zeroing + KVM_X86_SW_PROTECTED_VM: Doesn't have a formal definition so I guess we could require zeroing on transition? This way, the uptodate flag means that it has been prepared (as in sev_gmem_prepare()), and zeroed if required by VM type. Regarding flushing the dcache/tlb in your other question [2], if we don't use folio_zero_user(), can we relying on unmapping within core-mm to flush after shared use, and unmapping within KVM To flush after private use? Or should flush_dcache_folio() be explicitly called on kvm_gmem_fault()? clear_highpage(), used in the non-hugetlb (original) path, doesn't flush the dcache. Was that intended? > Thanks, > Elliot > >> >> <snip> [1] https://lore.kernel.org/all/20240726185157.72821-8-pbonzini@xxxxxxxxxx/ [2] https://lore.kernel.org/all/diqz34ldszp3.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/