Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 20.02.25 18:10, Fuad Tabba wrote:
On Thu, 20 Feb 2025 at 15:58, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 20.02.25 16:45, Fuad Tabba wrote:
Hi David,

On Thu, 20 Feb 2025 at 12:04, Fuad Tabba <tabba@xxxxxxxxxx> wrote:

On Thu, 20 Feb 2025 at 11:58, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 18.02.25 18:24, Fuad Tabba wrote:
Add support for mmap() and fault() for guest_memfd backed memory
in the host for VMs that support in-place conversion between
shared and private. To that end, this patch adds the ability to
check whether the VM type supports in-place conversion, and only
allows mapping its memory if that's the case.

This behavior is also gated by the configuration option
KVM_GMEM_SHARED_MEM.

Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
---
    include/linux/kvm_host.h |  11 +++++
    virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
    2 files changed, 114 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3ad0719bfc4f..f9e8b10a4b09 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
    }
    #endif

+/*
+ * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
+ * private memory is enabled and it supports in-place shared/private conversion.
+ */
+#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
+static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
+{
+     return false;
+}
+#endif
+
    #ifndef kvm_arch_has_readonly_mem
    static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
    {
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index c6f6792bec2a..30b47ff0e6d2 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
    {
        WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
    }
+
+static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
+{
+     struct kvm_gmem *gmem = file->private_data;
+
+     /* For now, VMs that support shared memory share all their memory. */
+     return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
+}
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+     struct inode *inode = file_inode(vmf->vma->vm_file);
+     struct folio *folio;
+     vm_fault_t ret = VM_FAULT_LOCKED;
+
+     filemap_invalidate_lock_shared(inode->i_mapping);
+
+     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+     if (IS_ERR(folio)) {
+             switch (PTR_ERR(folio)) {
+             case -EAGAIN:
+                     ret = VM_FAULT_RETRY;
+                     break;
+             case -ENOMEM:
+                     ret = VM_FAULT_OOM;
+                     break;
+             default:
+                     ret = VM_FAULT_SIGBUS;
+                     break;
+             }
+             goto out_filemap;
+     }
+
+     if (folio_test_hwpoison(folio)) {
+             ret = VM_FAULT_HWPOISON;
+             goto out_folio;
+     }
+
+     /* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
+     if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
+             ret = VM_FAULT_SIGBUS;
+             goto out_folio;
+     }
+
+     /*
+      * Only private folios are marked as "guestmem" so far, and we never
+      * expect private folios at this point.
+      */
+     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
+             ret = VM_FAULT_SIGBUS;
+             goto out_folio;
+     }
+
+     /* No support for huge pages. */
+     if (WARN_ON_ONCE(folio_test_large(folio))) {
+             ret = VM_FAULT_SIGBUS;
+             goto out_folio;
+     }
+
+     if (!folio_test_uptodate(folio)) {
+             clear_highpage(folio_page(folio, 0));
+             kvm_gmem_mark_prepared(folio);
+     }

kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
kvm_gmem_prepare_folio() instead.

Could we do the same here?

Will do.

I realized it's not that straightforward. __kvm_gmem_prepare_folio()
requires the kvm_memory_slot, which is used to calculate the gfn. At
that point we have neither, and it's not just an issue of access, but
there might not be a slot associated with that yet.

Hmm, right ... I wonder if that might be problematic. I assume no
memslot == no memory attribute telling us if it is private or shared at
least for now?

Once guest_memfd maintains that state, it might be "cleaner" ? What's
your thought?

The idea is that this doesn't determine whether it's shared or private
by the guest_memfd's attributes, but by the new state added in the
other patch series. That's independent of memslots and guest addresses
altogether.

One scenario you can imagine is the host wanting to fault in memory to
initialize it before associating it with a memslot. I guess we could
make it a requirement that you cannot fault-in pages unless they are
associated with a memslot, but that might be too restrictive.

Okay, just what I thought, thanks!

--
Cheers,

David / dhildenb





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux