On Thu, Oct 20, 2022 at 1:14 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > In yesterday's call, David brought up the case where we fallocate a file > in shmem, call mmap(MAP_PRIVATE) and then store to a page which is over > a hole. That currently causes shmem to allocate a page, zero-fill it, > then COW it, resulting in two pages being allocated when only the > COW page really needs to be allocated. > > The path we currently take through the MM when we take the page fault > looks like this (correct me if I'm wrong ...): > > handle_mm_fault() > __handle_mm_fault() > handle_pte_fault() > do_fault() > do_cow_fault() > __do_fault() > vm_ops->fault() > > ... which is where we come into shmem_fault(). Apart from the > horrendous hole-punch handling case, shmem_fault() is quite simple: > > err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, SGP_CACHE, > gfp, vma, vmf, &ret); > if (err) > return vmf_error(err); > vmf->page = folio_file_page(folio, vmf->pgoff); > return ret; > > What we could do here is detect this case. Something like: > > enum sgp_type sgp = SGP_CACHE; > > if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) > sgp = SGP_READ; > err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, sgp, gfp, > vma, vmf, &ret); > if (err) > return vmf_error(err); > if (folio) > vmf->page = folio_file_page(folio, vmf->pgoff); > else > vmf->page = NULL; > return ret; > > and change do_cow_fault() like this: > > +++ b/mm/memory.c > @@ -4575,12 +4575,17 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) > if (ret & VM_FAULT_DONE_COW) > return ret; > > - copy_user_highpage(vmf->cow_page, vmf->page, vmf->address, vma); > + if (vmf->page) > + copy_user_highpage(vmf->cow_page, vmf->page, vmf->address, vma); > + else > + clear_user_highpage(vmf->cow_page, vmf->address); > __SetPageUptodate(vmf->cow_page); > > ret |= finish_fault(vmf); > - unlock_page(vmf->page); > - put_page(vmf->page); > + if (vmf->page) { > + unlock_page(vmf->page); > + put_page(vmf->page); > + } > if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) > goto uncharge_out; > return ret; > > ... I wrote the code directly in my email client; definitely not > compile-tested. But if this situation is causing a real problem for > someone, this would be a quick fix for them. > > Is this a real problem or just intellectual curiosity? Also, does > this need support for THPs being created directly, or is khugepaged > fixing it up afterwards good enough? AFAIK, THP is not supported for private mapping. >