Re: [PATCH v5 2/9] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ackerley,

On Fri, 7 Mar 2025 at 17:04, Ackerley Tng <ackerleytng@xxxxxxxxxx> wrote:
>
> Fuad Tabba <tabba@xxxxxxxxxx> writes:
>
> > Before transitioning a guest_memfd folio to unshared, thereby
> > disallowing access by the host and allowing the hypervisor to
> > transition its view of the guest page as private, we need to be
> > sure that the host doesn't have any references to the folio.
> >
> > This patch introduces a new type for guest_memfd folios, which
> > isn't activated in this series but is here as a placeholder and
> > to facilitate the code in the subsequent patch series. This will
> > be used in the future to register a callback that informs the
> > guest_memfd subsystem when the last reference is dropped,
> > therefore knowing that the host doesn't have any remaining
> > references.
> >
> > This patch also introduces the configuration option,
> > KVM_GMEM_SHARED_MEM, which toggles support for mapping
> > guest_memfd shared memory at the host.
> >
> > Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
> > Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> > Acked-by: David Hildenbrand <david@xxxxxxxxxx>
> > ---
> >  include/linux/kvm_host.h   |  7 +++++++
> >  include/linux/page-flags.h | 16 ++++++++++++++++
> >  mm/debug.c                 |  1 +
> >  mm/swap.c                  |  9 +++++++++
> >  virt/kvm/Kconfig           |  5 +++++
> >  5 files changed, 38 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index f34f4cfaa513..7788e3625f6d 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -2571,4 +2571,11 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> >                                   struct kvm_pre_fault_memory *range);
> >  #endif
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +static inline void kvm_gmem_handle_folio_put(struct folio *folio)
> > +{
> > +     WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> > +}
> > +#endif
> > +
> >  #endif
>
> Following up with the discussion at the guest_memfd biweekly call on the
> guestmem library, I think this folio_put() handler for guest_memfd could
> be the first function that's refactored out into (placeholder name)
> mm/guestmem.c.
>
> This folio_put() handler has to stay in memory even after KVM (as a
> module) is unloaded from memory, and so it is a good candidate for the
> first function in the guestmem library.
>
> Along those lines, CONFIG_KVM_GMEM_SHARED_MEM in this patch can be
> renamed CONFIG_GUESTMEM, and CONFIG_GUESTMEM will guard the existence of
> PGTY_guestmem.
>
> CONFIG_KVM_GMEM_SHARED_MEM can be introduced in the next patch of this
> series, which could, in Kconfig, select CONFIG_GUESTMEM.
>
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 6dc2494bd002..daeee9a38e4c 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -933,6 +933,7 @@ enum pagetype {
> >       PGTY_slab       = 0xf5,
> >       PGTY_zsmalloc   = 0xf6,
> >       PGTY_unaccepted = 0xf7,
> > +     PGTY_guestmem   = 0xf8,
> >
> >       PGTY_mapcount_underflow = 0xff
> >  };
> > @@ -1082,6 +1083,21 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
> >  FOLIO_TEST_FLAG_FALSE(hugetlb)
> >  #endif
> >
> > +/*
> > + * guestmem folios are used to back VM memory as managed by guest_memfd. Once
> > + * the last reference is put, instead of freeing these folios back to the page
> > + * allocator, they are returned to guest_memfd.
> > + *
> > + * For now, guestmem will only be set on these folios as long as they  cannot be
> > + * mapped to user space ("private state"), with the plan of always setting that
> > + * type once typed folios can be mapped to user space cleanly.
> > + */
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +FOLIO_TYPE_OPS(guestmem, guestmem)
> > +#else
> > +FOLIO_TEST_FLAG_FALSE(guestmem)
> > +#endif
> > +
> >  PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
> >
> >  /*
> > diff --git a/mm/debug.c b/mm/debug.c
> > index 8d2acf432385..08bc42c6cba8 100644
> > --- a/mm/debug.c
> > +++ b/mm/debug.c
> > @@ -56,6 +56,7 @@ static const char *page_type_names[] = {
> >       DEF_PAGETYPE_NAME(table),
> >       DEF_PAGETYPE_NAME(buddy),
> >       DEF_PAGETYPE_NAME(unaccepted),
> > +     DEF_PAGETYPE_NAME(guestmem),
> >  };
> >
> >  static const char *page_type_name(unsigned int page_type)
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 47bc1bb919cc..241880a46358 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -38,6 +38,10 @@
> >  #include <linux/local_lock.h>
> >  #include <linux/buffer_head.h>
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +#include <linux/kvm_host.h>
> > +#endif
> > +
> >  #include "internal.h"
> >
> >  #define CREATE_TRACE_POINTS
> > @@ -101,6 +105,11 @@ static void free_typed_folio(struct folio *folio)
> >       case PGTY_hugetlb:
> >               free_huge_folio(folio);
> >               return;
> > +#endif
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +     case PGTY_guestmem:
> > +             kvm_gmem_handle_folio_put(folio);
> > +             return;
> >  #endif
> >       default:
> >               WARN_ON_ONCE(1);
> > diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> > index 54e959e7d68f..37f7734cb10f 100644
> > --- a/virt/kvm/Kconfig
> > +++ b/virt/kvm/Kconfig
> > @@ -124,3 +124,8 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
> >  config HAVE_KVM_ARCH_GMEM_INVALIDATE
> >         bool
> >         depends on KVM_PRIVATE_MEM
> > +
> > +config KVM_GMEM_SHARED_MEM
> > +       select KVM_PRIVATE_MEM
> > +       depends on !KVM_GENERIC_MEMORY_ATTRIBUTES
>
> Enforcing that KVM_GENERIC_MEMORY_ATTRIBUTES is not selected should not
> be a strict requirement. Fuad explained in an offline chat that this is
> just temporary.
>
> If we have CONFIG_GUESTMEM, then this question is moot, I think
> CONFIG_GUESTMEM would just be independent of everything else; other
> configs would depend on CONFIG_GUESTMEM.

There are two things here. First of all, the unfortunate naming
situation where PRIVATE could mean GUESTMEM, or private could mean not
shared. I plan to tackle this aspect (i.e., the naming) in a separate
patch series, since that will surely generate a lot of debate :)

The other part is that, with shared memory in-place, the memory
attributes are an orthogonal matter. The attributes are the userpace's
view of what it expects the state of the memory to be, and are used to
multiplex whether the memory being accessed is guest_memfd or the
regular (i.e., most likely anonymous) memory used normally by KVM.

This behavior however would be architecture, or even vm-type specific.

Cheers,
/fuad

> > +       bool




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux