On 3/29/24 10:24 PM, Michael Roth wrote:
> truncate_inode_pages_range() may attempt to zero pages before truncating
> them, and this will occur before arch-specific invalidations can be
> triggered via .invalidate_folio/.free_folio hooks via kvm_gmem_aops. For
> AMD SEV-SNP this would result in an RMP #PF being generated by the
> hardware, which is currently treated as fatal (and even if specifically
> allowed for, would not result in anything other than garbage being
> written to guest pages due to encryption). On Intel TDX this would also
> result in undesirable behavior.
>
> Set the AS_INACCESSIBLE flag to prevent the MM from attempting
> unexpected accesses of this sort during operations like truncation.
>
> This may also in some cases yield a decent performance improvement for
> guest_memfd userspace implementations that hole-punch ranges immediately
> after private->shared conversions via KVM_SET_MEMORY_ATTRIBUTES, since
> the current implementation of truncate_inode_pages_range() always ends
> up zero'ing an entire 4K range if it is backed by a 2M folio.
>
> Link: https://lore.kernel.org/lkml/ZR9LYhpxTaTk6PJX@xxxxxxxxxx/
> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Signed-off-by: Michael Roth <michael.roth@xxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
>  virt/kvm/guest_memfd.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 4ce0056d1149..3668a5f1d82b 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -428,6 +428,7 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
>  	inode->i_private = (void *)(unsigned long)flags;
>  	inode->i_op = &kvm_gmem_iops;
>  	inode->i_mapping->a_ops = &kvm_gmem_aops;
> +	inode->i_mapping->flags |= AS_INACCESSIBLE;
>  	inode->i_mode |= S_IFREG;
>  	inode->i_size = size;
>  	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
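
For readers less familiar with the truncation path, the intended effect of
the flag can be modelled in userspace roughly as follows. This is only a
sketch, not kernel code: toy_mapping, TOY_AS_INACCESSIBLE, TOY_FOLIO_SIZE and
toy_punch_partial() are hypothetical names invented for illustration, and the
model assumes the MM side checks the flag before zeroing a folio that is only
partially covered by a hole punch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
#define TOY_AS_INACCESSIBLE (1UL << 0)              /* models the intent of AS_INACCESSIBLE */
#define TOY_FOLIO_SIZE      (2UL * 1024 * 1024)     /* one 2M folio */

struct toy_mapping {
	unsigned long flags;                        /* models mapping->flags */
	unsigned char folio[TOY_FOLIO_SIZE];        /* contents of a single large folio */
};

/*
 * Model of punching a hole that covers only part of a folio: the folio
 * cannot be freed, so the range is zeroed in place unless the mapping is
 * marked inaccessible. Returns the number of bytes the CPU wrote.
 */
static size_t toy_punch_partial(struct toy_mapping *m, size_t off, size_t len)
{
	/* The punched range must lie inside the folio. */
	assert(off <= TOY_FOLIO_SIZE && len <= TOY_FOLIO_SIZE - off);

	if (m->flags & TOY_AS_INACCESSIBLE)
		return 0;                           /* skip the CPU write entirely */

	memset(m->folio + off, 0, len);
	return len;
}
```

In this model, punching a 4K hole in an unflagged mapping writes 4096 bytes
of zeroes, while the same call on a flagged mapping touches nothing, which
corresponds to both the avoided access and the saved zeroing work described
in the commit message.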