On Tue, Apr 20, 2021, Kirill A. Shutemov wrote:
> On Mon, Apr 19, 2021 at 08:09:13PM +0000, Sean Christopherson wrote:
> > On Mon, Apr 19, 2021, Kirill A. Shutemov wrote:
> > > The critical question is whether we ever need to translate hva->pfn after
> > > the page is added to the guest private memory. I believe we do, but I
> > > never checked. And that's the reason we need to keep hwpoison entries
> > > around, which encode pfn.
> >
> > As proposed in the TDX RFC, KVM would "need" the hva->pfn translation if the
> > guest private EPT entry was zapped, e.g. by NUMA balancing (which will fail on
> > the backend). But in that case, KVM still has the original PFN, the "new"
> > translation becomes a sanity check to make sure that the zapped translation
> > wasn't moved unexpectedly.
> >
> > Regardless, I don't see what that has to do with kvm_pfn_map. At some point,
> > gup() has to fault in the page or look at the host PTE value. For the latter,
> > at least on x86, we can throw info into the PTE itself to tag it as guest-only.
> > No matter what implementation we settle on, I think we've failed if we end up in
> > a situation where the primary MMU has pages it doesn't know are guest-only.
>
> I try to understand if it's a problem if KVM sees a guest-only PTE, but
> it's for other VM. Like two VM's try to use the same tmpfs file as guest
> memory. We cannot insert the pfn into two TD/SEV guest at once, but can it
> cause other problems? I'm not sure.

For TDX and SNP, "firmware" will prevent assigning the same PFN to multiple VMs.

For SEV and SEV-ES, the PSP (what I'm calling "firmware") will not prevent
assigning the same page to multiple guests. But the failure mode in that case,
assuming the guests have different ASIDs, is limited to corruption of the guest.

On the other hand, for SEV/SEV-ES it's not invalid to assign the same ASID to
multiple guests (there's an in-flight patch to do exactly that[*]), and sharing
PFNs between guests with the same ASID would also be valid. In other words, if
we want to enforce PFN association in the kernel, I think the association should
be per-ASID, not per-KVM guest.

So, I don't think we _need_ to rely on the TDX/SNP behavior, but if leveraging
firmware to handle those checks means avoiding additional complexity in the
kernel, then I think it's worth leaning on firmware even if it means SEV/SEV-ES
don't enjoy the same level of robustness.

[*] https://lkml.kernel.org/r/20210408223214.2582277-1-natet@xxxxxxxxxx
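
To make the per-ASID idea concrete, here is a rough userspace model of what
"PFN association keyed by ASID" could look like. This is only a sketch, not
proposed kernel code: pfn_claim(), pfn_release(), MODEL_NR_PFNS and ASID_NONE
are all hypothetical names, and a flat array stands in for whatever structure
the kernel would really use. The point is just that a second claim succeeds
when the ASID matches and fails when it differs, regardless of which guest is
asking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_PFNS 1024u	/* hypothetical: size of the modeled PFN space */
#define ASID_NONE     0u	/* hypothetical: marker for "no owner" */

/* Owning ASID and claim count per PFN; zero-initialized, so all unowned. */
uint32_t pfn_owner[MODEL_NR_PFNS];
uint32_t pfn_refs[MODEL_NR_PFNS];

/*
 * Claim @pfn on behalf of @asid.  Succeeds if the PFN is unowned or is
 * already owned by the same ASID (guests sharing an ASID may share the
 * page).  Fails if a different ASID owns it or the PFN is out of range.
 */
bool pfn_claim(uint64_t pfn, uint32_t asid)
{
	assert(asid != ASID_NONE);

	if (pfn >= MODEL_NR_PFNS)
		return false;
	if (pfn_owner[pfn] != ASID_NONE && pfn_owner[pfn] != asid)
		return false;

	pfn_owner[pfn] = asid;
	pfn_refs[pfn]++;
	return true;
}

/* Drop one claim on @pfn; the PFN becomes unowned when the last one goes. */
void pfn_release(uint64_t pfn)
{
	if (pfn >= MODEL_NR_PFNS || pfn_refs[pfn] == 0)
		return;
	if (--pfn_refs[pfn] == 0)
		pfn_owner[pfn] = ASID_NONE;
}
```

Note the model never looks at a guest/VM identity at all, which is the whole
difference from a per-KVM-guest association. Locking, and the question of
where such state would live, are deliberately ignored here.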