Re: folio_mmapped

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04.03.24 22:43, Elliot Berman wrote:
On Mon, Mar 04, 2024 at 09:17:05PM +0100, David Hildenbrand wrote:
On 04.03.24 20:04, Sean Christopherson wrote:
On Mon, Mar 04, 2024, Quentin Perret wrote:
As discussed in the sub-thread, that might still be required.

One could think about completely forbidding GUP on these mmap'ed
guest-memfds. But likely, there might be use cases in the future where you
want to use GUP on shared memory inside a guest_memfd.

(the iouring example I gave might currently not work because
FOLL_PIN|FOLL_LONGTERM|FOLL_WRITE only works on shmem+hugetlb, and
guest_memfd will likely not be detected as shmem; 8ac268436e6d contains some
details)

Perhaps it would be wise to start with GUP being forbidden if the
current users do not need it (not sure if that is the case in Android,
I'll check) ? We can always relax this constraint later when/if the
use-cases arise, which is obviously much harder to do the other way
around.

+1000.  At least on the KVM side, I would like to be as conservative as possible
when it comes to letting anything other than the guest access guest_memfd.

So we'll have to do it similar to any occurrences of "secretmem" in gup.c.
We'll have to see how to marry KVM guest_memfd with core-mm code similar to
e.g., folio_is_secretmem().

IIRC, we might not be able to de-reference the actual mapping because it
could get free concurrently ...

That will then prohibit any kind of GUP access to these pages, including
reading/writing for ptrace/debugging purposes, for core dumping purposes
etc. But at least, you know that nobody was able to optain page references
using GUP that might be used for reading/writing later.

Do you have any concerns to add to enum mapping_flags, AS_NOGUP, and
replacing folio_is_secretmem() with a test of this bit instead of
comparing the a_ops? I think it scales better.

The only concern I have are races, but let's look into the details:

In GUP-fast, we can essentially race with unmap of folios, munmap() of VMAs etc.

We had a similar discussion recently about possible races. It's documented in folio_fast_pin_allowed() regarding disabled IRQs and RCU grace periods.

"inodes and thus their mappings are freed under RCU, which means the mapping cannot be freed beneath us and thus we can safely dereference it."

So if we follow the same rules as folio_fast_pin_allowed(), we can de-reference folio->mapping, for example comparing mapping->a_ops.

[folio_is_secretmem should better follow the same approach]

So likely that should just work!

--
Cheers,

David / dhildenb





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux