On Wed, Nov 20, 2024 at 6:02 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > > > We don't have a fully formed destination yet, so I can't give you a > > > > definite answer to a lot of questions. Obviously I don't want to hold > > > > up the Rust project in any way, but I need to know that what we're trying > > > > to do will be expressible in Rust. > > > > > > > > Can we avoid referring to a page's refcount? > > > > > > I don't think this patch needs the refcount at all, and the previous > > > version did not expose it. This came out of the advice to use put_page > > > over free_page. Does this mean that we should switch to put_page but > > > not use get_page? > > Did I advise using put_page() over free_page()? I hope I didn't say > that. I don't see a reason why binder needs to refcount its pages (nor > use a mapcount on them), but I don't fully understand binder so maybe > it does need a refcount. I think that was me, at <https://lore.kernel.org/all/CAG48ez32zWt4mcfA+y2FnzzNmFe-0ns9XQgp=QYeFpRsdiCAnw@xxxxxxxxxxxxxx/>. Looking at the C binder version, binder_install_single_page() installs pages into userspace page tables in a VM_MIXEDMAP mapping using vm_insert_page(), and when you do that with pages from the page allocator, userspace can grab references to them through GUP-fast (and I think also through GUP). (See how vm_insert_page() and vm_get_page_prot() don't use pte_mkspecial(), which is pretty much the only thing that can stop GUP-fast on most architectures.) My understanding is that the combination VM_IO|VM_MIXEDMAP would stop normal GUP, but currently the only way to block GUP-fast is to use VM_PFNMAP. (Which, as far as I understand, is also why GPU drivers use VM_PFNMAP so much.) Maybe we should change that, so that VM_IO and/or VM_MIXEDMAP blocks GUP in the region and causes installed PTEs to be marked with pte_mkspecial()? I am not entirely sure about this stuff, but I was recently looking at net/packet/af_packet.c, and I tested that vmsplice() can grab references to the high-order compound pages that alloc_one_pg_vec_page() allocates with __get_free_pages(GFP_KERNEL | __GFP_COMP | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY, order), packet_mmap() inserts with vm_insert_page(), and free_pg_vec() drops with free_pages(). (But that all happens to actually work fine, free_pages() actually handles refcounted compound pages properly.)