Re: [PATCH RFC 4/4] mm: guest_memfd: Add ability for mmap'ing pages

On 15.08.24 09:24, Fuad Tabba wrote:
> Hi David,

Hi!

> On Tue, 6 Aug 2024 at 14:51, David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>>> -     if (gmem_flags & GUEST_MEMFD_FLAG_NO_DIRECT_MAP) {
>>> +     if (!ops->accessible && (gmem_flags & GUEST_MEMFD_FLAG_NO_DIRECT_MAP)) {
>>>                r = guest_memfd_folio_private(folio);
>>>                if (r)
>>>                        goto out_err;
>>> @@ -107,6 +109,82 @@ struct folio *guest_memfd_grab_folio(struct file *file, pgoff_t index, u32 flags
>>>    }
>>>    EXPORT_SYMBOL_GPL(guest_memfd_grab_folio);
>>>
>>> +int guest_memfd_make_inaccessible(struct file *file, struct folio *folio)
>>> +{
>>> +     unsigned long gmem_flags = (unsigned long)file->private_data;
>>> +     unsigned long i;
>>> +     int r;
>>> +
>>> +     unmap_mapping_folio(folio);
>>> +
>>> +     /**
>>> +      * We can't use the refcount. It might be elevated due to
>>> +      * guest/vcpu trying to access same folio as another vcpu
>>> +      * or because userspace is trying to access folio for same reason
>>
>> As discussed, that's insufficient. We really have to drive the refcount
>> to 1 -- the single reference we expect.
>>
>> What is the exact problem you are running into here? Who can just grab a
>> reference and maybe do nasty things with it?
>
> I was wondering, why do we need to check the refcount? Isn't it enough
> to check for page_mapped() || page_maybe_dma_pinned(), while holding
> the folio lock?

(folio_mapped() + folio_maybe_dma_pinned())

Not everything goes through FOLL_PIN. vmsplice() is one example, as is a simple read/write through /proc/pid/mem. Further, some O_DIRECT implementations still don't use FOLL_PIN.

So once you have mapped that thing to user space and you see an additional folio reference, you have to assume that someone could be reading/writing that memory, possibly in a perfectly sane context. (vmsplice() should be using FOLL_PIN|FOLL_LONGTERM, but that's a longer discussion.)

(Note that folio_maybe_dma_pinned() can also have false positives in some cases, due to speculative references or *many* references.)
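
To make the difference between the two checks concrete, here is a toy user-space model, not kernel code. Everything prefixed with toy_ is made up for illustration. The only thing borrowed from the kernel is the idea that, for small folios, a FOLL_PIN pin is accounted as a large bias (GUP_PIN_COUNTING_BIAS, 1024) added to the refcount, so the "maybe pinned" test is a threshold comparison. Treat it as a sketch under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: mirrors the kernel's GUP_PIN_COUNTING_BIAS for small folios. */
#define TOY_PIN_BIAS 1024

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct toy_folio {
	int refcount;	/* all references, pins included */
	int mapcount;	/* number of page table mappings */
};

/* Plain reference, e.g. taken without FOLL_PIN (vmsplice(), /proc/pid/mem). */
static inline void toy_get(struct toy_folio *f)
{
	assert(f != NULL);
	f->refcount += 1;
}

/* FOLL_PIN-style reference: accounted as a large bias on the refcount. */
static inline void toy_pin(struct toy_folio *f)
{
	assert(f != NULL);
	f->refcount += TOY_PIN_BIAS;
}

static inline bool toy_mapped(const struct toy_folio *f)
{
	return f->mapcount > 0;
}

/* Threshold check: cannot tell one pin from many plain references. */
static inline bool toy_maybe_pinned(const struct toy_folio *f)
{
	return f->refcount >= TOY_PIN_BIAS;
}

/* The check asked about: not mapped and not (maybe) pinned. */
static inline bool toy_safe_by_mapped_pinned(const struct toy_folio *f)
{
	return !toy_mapped(f) && !toy_maybe_pinned(f);
}

/* The stricter check: only the single expected reference remains. */
static inline bool toy_safe_by_refcount(const struct toy_folio *f)
{
	return f->refcount == 1;
}
```

Starting from an unmapped toy folio with refcount 1, a single toy_get() leaves toy_safe_by_mapped_pinned() returning true while toy_safe_by_refcount() returns false, which is the case the mapped/pinned check misses. Conversely, 1024 plain toy_get() calls make toy_maybe_pinned() return true without any pin, which is the false-positive case.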

--
Cheers,

David / dhildenb




