On Wed, May 12, 2021 at 2:31 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 5/12/21 1:14 PM, Peter Xu wrote:
> > On Wed, May 12, 2021 at 12:42:32PM -0700, Mina Almasry wrote:
> >>>>> @@ -4868,30 +4869,39 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> >>>>> +       WARN_ON(*pagep);
> >>>>
> >>>> I don't think this warning works, because we do set *pagep, in the
> >>>> copy_huge_page_from_user failure case. In that case, the following
> >>>> happens:
> >>>>
> >>>> 1. We set *pagep, and return immediately.
> >>>> 2. Our caller notices this particular error, drops mmap_lock, and then
> >>>> calls us again with *pagep set.
> >>>>
> >>>> In this path, we're supposed to just re-use this existing *pagep
> >>>> instead of allocating a second new page.
> >>>>
> >>>> I think this also means we need to keep the "else" case where *pagep
> >>>> is set below.
> >>>>
> >>>
> >>> +1 to Peter's comment.
> >>>
>
> Apologies to Axel (and Peter) as that comment was from Axel.
>
> >>
> >> Gah, sorry about that. I'll fix in v2.
> >
> > I have a question regarding v1: how do you guarantee huge_add_to_page_cache()
> > won't fail again even if checked before page alloc? Say, what if the page
> > cache got inserted after hugetlbfs_pagecache_present() (which is newly added in
> > your v1) but before huge_add_to_page_cache()?
>
> In the caller (__mcopy_atomic_hugetlb) we obtain the hugetlb fault mutex
> before calling this routine. This should prevent changes to the cache
> while in the routine.
>
> However, things get complicated in the case where copy_huge_page_from_user
> fails. In this case, we will return to the caller which will drop mmap_lock
> and the hugetlb fault mutex before doing the copy. After dropping the
> mutex, someone could populate the cache. This would result in the same
> situation where two reserves are 'temporarily' consumed for the same
> mapping offset. By the time we get to the second call to
> hugetlb_mcopy_atomic_pte where the previously allocated page is passed
> in, it is too late.
>

Thanks. I tried locally to allocate a page, then add it into the cache,
*then* copy its contents (dropping that lock if that fails). That also
has the test passing, but I'm not sure if I'm causing a fire somewhere
else by having a page in the cache that has uninitialized contents. The
only other code that checks the cache seems to be the
hugetlb_fault/hugetlb_cow code. I'm reading that code to try to
understand if I'm breaking that code doing this.

> --
> Mike Kravetz
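
For readers following the thread, the window Mike describes can be modeled
in a few lines of userspace C. This is only a sketch under stated
assumptions, not kernel code: every name in it (struct model, toy_mcopy_pte,
toy_caller, and so on) is hypothetical, the fault mutex is implied rather
than implemented, and -ENOENT is used here merely to stand in for "this
particular error" that makes the caller retry. It tracks a reserve counter
and its low-water mark, so the temporary double consumption for one mapping
offset becomes visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, none are kernel APIs. */
struct model {
	int reserves;       /* reserved huge pages still available */
	int min_reserves;   /* lowest value 'reserves' has reached */
	bool cache_present; /* page cache already holds this offset */
	bool copy_faults;   /* the first, in-lock user copy fails */
};

struct toy_page { int unused; };
static struct toy_page the_page;

static void take_reserve(struct model *m)
{
	m->reserves--;
	if (m->reserves < m->min_reserves)
		m->min_reserves = m->reserves;
}

static struct toy_page *toy_alloc(struct model *m)
{
	take_reserve(m);
	return &the_page;
}

static void toy_free(struct model *m, struct toy_page *page)
{
	assert(page == &the_page);
	m->reserves++;
}

/* One pass of the modeled copy routine; the modeled mutex is held. */
static int toy_mcopy_pte(struct model *m, struct toy_page **pagep)
{
	struct toy_page *page;

	if (!*pagep) {
		/* First pass: this check is stable while the mutex is held. */
		if (m->cache_present)
			return -EEXIST;
		page = toy_alloc(m);
		if (m->copy_faults) {
			/* Hand the page back so the caller can copy unlocked. */
			m->copy_faults = false;
			*pagep = page;
			return -ENOENT;
		}
	} else {
		/* Retry pass: reuse the page from the first pass. */
		page = *pagep;
		*pagep = NULL;
	}

	if (m->cache_present) {
		/* The cache was populated while the mutex was dropped. */
		toy_free(m, page);
		return -EEXIST;
	}
	m->cache_present = true;
	return 0;
}

/* Modeled caller: retries once, optionally losing a race in between. */
static int toy_caller(struct model *m, bool racer_in_window)
{
	struct toy_page *page = NULL;
	int ret = toy_mcopy_pte(m, &page);

	if (ret == -ENOENT) {
		/* Mutex dropped here; a racer may fill the cache. */
		if (racer_in_window && !m->cache_present) {
			take_reserve(m); /* the racer's page uses a reserve too */
			m->cache_present = true;
		}
		ret = toy_mcopy_pte(m, &page);
	}
	return ret;
}
```

With a single reserve and a racer in the unlocked window, toy_caller()
ends with -EEXIST and min_reserves dips below zero before the retry pass
frees the page again, which is the 'temporary' double consumption
discussed above. Without the racer, the low-water mark stays at zero.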