Hi Peter,

On Wed, 12 Feb 2025 at 18:19, Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Tue, Feb 11, 2025 at 12:11:18PM +0000, Fuad Tabba wrote:
> > Before transitioning a guest_memfd folio to unshared, thereby
> > disallowing access by the host and allowing the hypervisor to
> > transition its view of the guest page as private, we need to be
> > sure that the host doesn't have any references to the folio.
> >
> > This patch introduces a new type for guest_memfd folios, which
> > isn't activated in this series but is here as a placeholder and
> > to facilitate the code in the next patch. This will be used in
> > the future to register a callback that informs the guest_memfd
> > subsystem when the last reference is dropped, therefore knowing
> > that the host doesn't have any remaining references.
> >
> > Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
> > ---
> >  include/linux/kvm_host.h   |  9 +++++++++
> >  include/linux/page-flags.h | 17 +++++++++++++++++
> >  mm/debug.c                 |  1 +
> >  mm/swap.c                  |  9 +++++++++
> >  virt/kvm/guest_memfd.c     |  7 +++++++
> >  5 files changed, 43 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index f34f4cfaa513..8b5f28f6efff 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -2571,4 +2571,13 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> >  				    struct kvm_pre_fault_memory *range);
> >  #endif
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +void kvm_gmem_handle_folio_put(struct folio *folio);
> > +#else
> > +static inline void kvm_gmem_handle_folio_put(struct folio *folio)
> > +{
> > +	WARN_ON_ONCE(1);
> > +}
> > +#endif
> > +
> >  #endif
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 6dc2494bd002..734afda268ab 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -933,6 +933,17 @@ enum pagetype {
> >  	PGTY_slab		= 0xf5,
> >  	PGTY_zsmalloc		= 0xf6,
> >  	PGTY_unaccepted		= 0xf7,
> > +	/*
> > +	 * guestmem folios are used to back VM memory as managed by guest_memfd.
> > +	 * Once the last reference is put, instead of freeing these folios back
> > +	 * to the page allocator, they are returned to guest_memfd.
> > +	 *
> > +	 * For now, guestmem will only be set on these folios as long as they
> > +	 * cannot be mapped to user space ("private state"), with the plan of
> > +	 * always setting that type once typed folios can be mapped to user
> > +	 * space cleanly.
>
> Does it imply that gmem folios can be mapped to userspace at some point?
> It'll be great if you can share more about it, since as of now it looks
> like anything has a page type cannot use the per-page mapcount.

This is the goal of this series. By the end of this series, you can
map gmem folios, as long as they belong to a VM type that allows it.

My other series, which will be rebased on this one, adds the
distinction of memory shared with the host vs memory private to the
guest:

https://lore.kernel.org/all/20250117163001.2326672-1-tabba@xxxxxxxxxx/

That series deals with the mapcount issue by only applying the type
once the mapcount is 0. When we talked about this in the guest_memfd
mm sync, David Hildenbrand mentioned ongoing work to stop overlaying
the type with the mapcount. That should solve the problem completely.

> When looking at this, I also found that __folio_rmap_sanity_checks() has
> some folio_test_hugetlb() tests, not sure whether they're prone to be
> changed too e.g. to cover all pages that have a type, so as to cover gmem.
>
> For the longer term, it'll be definitely nice if gmem folios can be
> mapcounted just like normal file folios. It can enable gmem as a backstore
> just like what normal memfd would do, with gmem managing the folios.

That's the plan, I agree.
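
To make the two rules in this part of the thread concrete, here is a
rough user-space model. It is only a sketch, not kernel code: every
model_* name is hypothetical, the type and the mapcount are kept in
separate fields for readability (in the kernel they currently share
storage, which is the issue being discussed), and the guestmem case
models the intended end state rather than the WARN placeholder in this
patch. It shows the type being applied only once the mapcount is 0, and
the final put handing a guestmem folio back to guest_memfd instead of
the page allocator.

```c
#include <assert.h>

/* Hypothetical stand-ins for page types; not the kernel's PGTY_* values. */
enum model_pagetype {
	MODEL_PGTY_NONE = 0,
	MODEL_PGTY_GUESTMEM,
};

/* Hypothetical, heavily simplified stand-in for struct folio. */
struct model_folio {
	int refcount;			/* outstanding references */
	int mapcount;			/* user-space mappings */
	enum model_pagetype type;	/* modeled as a separate field */
	int returned_to_gmem;		/* set when guest_memfd gets it back */
	int freed_to_allocator;		/* set when freed normally */
};

/*
 * Apply the guestmem type only if nothing maps the folio.
 * Returns 0 on success, -1 if the folio is still mapped.
 */
static int model_set_guestmem(struct model_folio *f)
{
	if (f->mapcount != 0)
		return -1;
	f->type = MODEL_PGTY_GUESTMEM;
	return 0;
}

/* Drop one reference; on the last put, dispatch on the folio type. */
static void model_folio_put(struct model_folio *f)
{
	assert(f->refcount > 0);
	if (--f->refcount > 0)
		return;

	switch (f->type) {
	case MODEL_PGTY_GUESTMEM:
		/* Typed folio: hand it back to its owner. */
		f->returned_to_gmem = 1;
		break;
	default:
		/* Untyped folio: the usual free path. */
		f->freed_to_allocator = 1;
		break;
	}
}
```

Calling model_set_guestmem() on a folio with a non-zero mapcount fails,
which mirrors why the type can only be set while the folio is in the
private state for now.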

> > +	 */
> > +	PGTY_guestmem		= 0xf8,
> >
> >  	PGTY_mapcount_underflow = 0xff
> >  };
> > @@ -1082,6 +1093,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
> >  FOLIO_TEST_FLAG_FALSE(hugetlb)
> >  #endif
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
>
> This seems to only be defined in follow up patches.. so may need some
> adjustments.

It's a configuration option. If you like, I could bring forward the
patch that adds it to the kconfig file.

Thank you,
/fuad

> > +FOLIO_TYPE_OPS(guestmem, guestmem)
> > +#else
> > +FOLIO_TEST_FLAG_FALSE(guestmem)
> > +#endif
> > +
> >  PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
> >
> >  /*
> > diff --git a/mm/debug.c b/mm/debug.c
> > index 8d2acf432385..08bc42c6cba8 100644
> > --- a/mm/debug.c
> > +++ b/mm/debug.c
> > @@ -56,6 +56,7 @@ static const char *page_type_names[] = {
> >  	DEF_PAGETYPE_NAME(table),
> >  	DEF_PAGETYPE_NAME(buddy),
> >  	DEF_PAGETYPE_NAME(unaccepted),
> > +	DEF_PAGETYPE_NAME(guestmem),
> >  };
> >
> >  static const char *page_type_name(unsigned int page_type)
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 47bc1bb919cc..241880a46358 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -38,6 +38,10 @@
> >  #include <linux/local_lock.h>
> >  #include <linux/buffer_head.h>
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +#include <linux/kvm_host.h>
> > +#endif
> > +
> >  #include "internal.h"
> >
> >  #define CREATE_TRACE_POINTS
> > @@ -101,6 +105,11 @@ static void free_typed_folio(struct folio *folio)
> >  	case PGTY_hugetlb:
> >  		free_huge_folio(folio);
> >  		return;
> > +#endif
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +	case PGTY_guestmem:
> > +		kvm_gmem_handle_folio_put(folio);
> > +		return;
> >  #endif
> >  	default:
> >  		WARN_ON_ONCE(1);
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index b2aa6bf24d3a..c6f6792bec2a 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -312,6 +312,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> >  	return gfn - slot->base_gfn + slot->gmem.pgoff;
> >  }
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +void kvm_gmem_handle_folio_put(struct folio *folio)
> > +{
> > +	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> > +}
> > +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
> > +
> >  static struct file_operations kvm_gmem_fops = {
> >  	.open		= generic_file_open,
> >  	.release	= kvm_gmem_release,
> > --
> > 2.48.1.502.g6dc24dfdaf-goog
> >
> >
>
> --
> Peter Xu
>