Mina,

On Wed, May 12, 2021 at 12:42:32PM -0700, Mina Almasry wrote:
> > >> @@ -4868,30 +4869,39 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > >>  			    struct page **pagep)
> > >>  {
> > >>  	bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
> > >> -	struct address_space *mapping;
> > >> -	pgoff_t idx;
> > >> +	struct hstate *h = hstate_vma(dst_vma);
> > >> +	struct address_space *mapping = dst_vma->vm_file->f_mapping;
> > >> +	pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> > >>  	unsigned long size;
> > >>  	int vm_shared = dst_vma->vm_flags & VM_SHARED;
> > >> -	struct hstate *h = hstate_vma(dst_vma);
> > >>  	pte_t _dst_pte;
> > >>  	spinlock_t *ptl;
> > >> -	int ret;
> > >> +	int ret = -ENOMEM;
> > >>  	struct page *page;
> > >>  	int writable;
> > >>
> > >> -	mapping = dst_vma->vm_file->f_mapping;
> > >> -	idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> > >> +	/* Out parameter. */
> > >> +	WARN_ON(*pagep);
> > >
> > > I don't think this warning works, because we do set *pagep in the
> > > copy_huge_page_from_user failure case. In that case, the following
> > > happens:
> > >
> > > 1. We set *pagep, and return immediately.
> > > 2. Our caller notices this particular error, drops mmap_lock, and then
> > >    calls us again with *pagep set.
> > >
> > > In this path, we're supposed to just re-use this existing *pagep
> > > instead of allocating a second new page.
> > >
> > > I think this also means we need to keep the "else" case where *pagep
> > > is set below.
> >
> > +1 to Peter's comment.
>
> Gah, sorry about that. I'll fix in v2.

I have a question regarding v1: how do you guarantee that
huge_add_to_page_cache() won't still fail, even with the check done
before the page allocation?  Say, what if a page got inserted into the
page cache after hugetlbfs_pagecache_present() (which is newly added in
your v1) but before huge_add_to_page_cache()?

I also have a feeling that we've been trying to work around something
else, but I can't tell yet; I'll probably need to read up more carefully
on how hugetlb does its accounting, and on reservations.
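
To spell out the window I mean, here is a toy userspace analogy (the
helpers below are made-up stand-ins for hugetlbfs_pagecache_present()
and huge_add_to_page_cache(); this is not the kernel code): the
presence check and the insert are two separate atomic operations, so a
concurrent insert can land in between, and our insert still fails with
-EEXIST even though the check said the slot was free.

/* Toy userspace analogy of the check-then-insert race; illustrative only. */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int slot_occupied;	/* stand-in for one page cache slot */

/* Stand-in for hugetlbfs_pagecache_present(): a momentary peek. */
static int slot_present(void)
{
	pthread_mutex_lock(&lock);
	int present = slot_occupied;
	pthread_mutex_unlock(&lock);
	return present;
}

/* Stand-in for huge_add_to_page_cache(): fails on an occupied slot. */
static int slot_insert(void)
{
	pthread_mutex_lock(&lock);
	int err = slot_occupied ? -EEXIST : 0;
	slot_occupied = 1;
	pthread_mutex_unlock(&lock);
	return err;
}

static void *concurrent_inserter(void *arg)
{
	(void)arg;
	slot_insert();		/* wins the race inside the window below */
	return NULL;
}

int main(void)
{
	pthread_t t;
	int err;

	if (slot_present())	/* the check says the slot is free... */
		return 0;

	/* ...but nothing pins that result; the race window is here. */
	pthread_create(&t, NULL, concurrent_inserter, NULL);
	pthread_join(&t, NULL);

	err = slot_insert();	/* fails even though the check passed */
	printf("insert after check: %d%s\n", err,
	       err == -EEXIST ? " (-EEXIST)" : "");
	return 0;
}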
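
For completeness, the *pagep retry protocol quoted above can be modeled
the same way (again a toy userspace sketch with hypothetical names, not
the kernel code).  It shows why *pagep is legitimately non-NULL on the
second call, so an unconditional WARN_ON(*pagep) at function entry would
fire on a perfectly valid path:

/* Toy userspace model of the *pagep retry protocol; illustrative only. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/*
 * Stand-in for hugetlb_mcopy_atomic_pte(): *pagep is an in/out
 * parameter.  First call: allocate a page, stash it in *pagep, and
 * ask the caller to retry.  Retry: consume the already-filled *pagep
 * instead of allocating a second new page.
 */
static int mcopy_atomic(char **pagep)
{
	if (!*pagep) {
		*pagep = malloc(TOY_PAGE_SIZE);
		if (!*pagep)
			return -ENOMEM;
		/* Pretend the copy faulted with the lock held. */
		return -ENOENT;
	}
	/* A WARN_ON(*pagep) at entry would trip right here. */
	printf("installing page: %s\n", *pagep);
	free(*pagep);
	*pagep = NULL;
	return 0;
}

int main(void)
{
	char *page = NULL;
	int err;

	while ((err = mcopy_atomic(&page)) == -ENOENT) {
		/* The real caller drops mmap_lock before this copy. */
		strcpy(page, "data copied outside the lock");
	}
	return err ? 1 : 0;
}

Thanks,

-- 
Peter Xu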