On 5/20/21 12:21 PM, Mina Almasry wrote: > On Thu, May 20, 2021 at 12:18 PM Mina Almasry <almasrymina@xxxxxxxxxx> wrote: >> >> On Thu, May 13, 2021 at 5:14 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote: >>> >>> How about this approach? >>> - Keep the check for hugetlbfs_pagecache_present in hugetlb_mcopy_atomic_pte >>> that you added. That will catch the race where the page was added to >>> the cache before entering the routine. >>> - With the above check in place, we only need to worry about the case >>> where copy_huge_page_from_user fails and we must drop locks. In this >>> case we: >>> - Free the page previously allocated. >>> - Allocate a 'temporary' huge page without consuming reserves. I'm >>> thinking of something similar to page migration. >>> - Drop the locks and let the copy_huge_page_from_user be done to the >>> temporary page. >>> - When reentering hugetlb_mcopy_atomic_pte after dropping locks (the >>> *pagep case) we need to once again check >>> hugetlbfs_pagecache_present. >>> - We then try to allocate the huge page which will consume the >>> reserve. If successful, copy contents of temporary page to newly >>> allocated page. Free temporary page. >>> >>> There may be issues with this, and I have not given it deep thought. It >>> does abuse the temporary huge page concept, but perhaps no more than >>> page migration. Things do slow down if the extra page allocation and >>> copy is required, but that would only be the case if copy_huge_page_from_user >>> needs to be done without locks. Not sure, but hoping that is rare. >> >> Just following up this a bit: I've implemented this approach locally, >> and with it it's passing the test as-is. When I hack the code such >> that the copy in hugetlb_mcopy_atomic_pte() always fails, I run into >> this edge case, which causes resv_huge_pages to underflow again (this >> time permemantly): >> >> - hugetlb_no_page() is called on an index and a page is allocated and >> inserted into the cache consuming the reservation. >> - remove_huge_page() is called on this index and the page is removed from cache. >> - hugetlb_mcopy_atomic_pte() is called on this index, we do not find >> the page in the cache and we trigger this code patch and the copy >> fails. >> - The allocations in this code path seem to double consume the >> reservation and resv_huge_pages underflows. >> >> I'm looking at this edge case to understand why a prior >> remove_huge_page() causes my code to underflow resv_huge_pages. >> > > I should also mention, without a prior remove_huge_page() this code > path works fine, so it seems the fact that the reservation is consumed > before causes trouble, but I'm not sure why yet. > Hi Mina, How about quickly posting the code? I may be able to provide more suggestions if I can see the actual code. -- Mike Kravetz