Re: folio_mmapped

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Monday 04 Mar 2024 at 22:58:49 (+0100), David Hildenbrand wrote:
> On 04.03.24 22:43, Elliot Berman wrote:
> > On Mon, Mar 04, 2024 at 09:17:05PM +0100, David Hildenbrand wrote:
> > > On 04.03.24 20:04, Sean Christopherson wrote:
> > > > On Mon, Mar 04, 2024, Quentin Perret wrote:
> > > > > > As discussed in the sub-thread, that might still be required.
> > > > > > 
> > > > > > One could think about completely forbidding GUP on these mmap'ed
> > > > > > guest-memfds. But likely, there might be use cases in the future where you
> > > > > > want to use GUP on shared memory inside a guest_memfd.
> > > > > > 
> > > > > > (the iouring example I gave might currently not work because
> > > > > > FOLL_PIN|FOLL_LONGTERM|FOLL_WRITE only works on shmem+hugetlb, and
> > > > > > guest_memfd will likely not be detected as shmem; 8ac268436e6d contains some
> > > > > > details)
> > > > > 
> > > > > Perhaps it would be wise to start with GUP being forbidden if the
> > > > > current users do not need it (not sure if that is the case in Android,
> > > > > I'll check) ? We can always relax this constraint later when/if the
> > > > > use-cases arise, which is obviously much harder to do the other way
> > > > > around.
> > > > 
> > > > +1000.  At least on the KVM side, I would like to be as conservative as possible
> > > > when it comes to letting anything other than the guest access guest_memfd.
> > > 
> > > So we'll have to do it similar to any occurrences of "secretmem" in gup.c.
> > > We'll have to see how to marry KVM guest_memfd with core-mm code similar to
> > > e.g., folio_is_secretmem().
> > > 
> > > IIRC, we might not be able to de-reference the actual mapping because it
> > > could get free concurrently ...
> > > 
> > > That will then prohibit any kind of GUP access to these pages, including
> > > reading/writing for ptrace/debugging purposes, for core dumping purposes
> > > etc. But at least, you know that nobody was able to optain page references
> > > using GUP that might be used for reading/writing later.
> > 
> > Do you have any concerns to add to enum mapping_flags, AS_NOGUP, and
> > replacing folio_is_secretmem() with a test of this bit instead of
> > comparing the a_ops? I think it scales better.
> 
> The only concern I have are races, but let's look into the details:
> 
> In GUP-fast, we can essentially race with unmap of folios, munmap() of VMAs
> etc.
> 
> We had a similar discussion recently about possible races. It's documented
> in folio_fast_pin_allowed() regarding disabled IRQs and RCU grace periods.
> 
> "inodes and thus their mappings are freed under RCU, which means the mapping
> cannot be freed beneath us and thus we can safely dereference it."
> 
> So if we follow the same rules as folio_fast_pin_allowed(), we can
> de-reference folio->mapping, for example comparing mapping->a_ops.
> 
> [folio_is_secretmem should better follow the same approach]

Resurecting this discussion, I had discussions internally and as it
turns out Android makes extensive use of vhost/vsock when communicating
with guest VMs, which requires GUP. So, my bad, not supporting GUP for
the pKVM variant of guest_memfd is a bit of a non-starter, we'll need to
support it from the start. But again this should be a matter of 'simply'
having a dedicated KVM exit reason so hopefully it's not too bad.

Thanks,
Quentin




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux