On Fri, Mar 19, 2021 at 03:42:08PM -0700, Mike Kravetz wrote:
> The locks acquired in free_huge_page are irq safe. However, in certain
> circumstances the routine update_and_free_page could sleep. Since
> free_huge_page can be called from any context, it can not sleep.
>
> Use a waitqueue to defer freeing of pages if the operation may sleep. A
> new routine update_and_free_page_no_sleep provides this functionality
> and is only called from free_huge_page.
>
> Note that any 'pages' sent to the workqueue for deferred freeing have
> already been removed from the hugetlb subsystem. What is actually
> deferred is returning those base pages to the low level allocator.

So maybe I'm stupid, but why do you need that work in hugetlb? Afaict it
should be in cma_release().

Also, afaict cma_release() does free_contig_range() *first*, and then
does the 'difficult' bits. So how about you re-order
free_gigantic_page() a bit to make it unconditionally do
free_contig_range() and *then* call into CMA, which can then do a
workqueue thingy if it feels like it.

That way none of the hugetlb accounting is delayed, and only CMA gets to
suffer.
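
To make the proposed ordering concrete, here is a rough userspace model, not kernel code and not a patch. Every model_* name and counter is hypothetical, and the can_sleep flag stands in for whatever context check the real code would use. The sketch assumes the 'difficult' bits are CMA-side bookkeeping that may sleep; the mail does not spell out what they are. It only shows the base pages being returned unconditionally first, with CMA alone deciding whether to defer its own part.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters that let the model be observed; none of this exists in the kernel. */
static unsigned long pages_freed;   /* base pages handed back to the allocator */
static int cma_inline;              /* CMA bookkeeping done immediately */
static int cma_deferred;            /* CMA bookkeeping pushed to a worker */

/* Stand-in for free_contig_range(): assumed never to sleep in this model. */
static void model_free_contig_range(unsigned long pfn, unsigned long nr_pages)
{
    (void)pfn;
    pages_freed += nr_pages;
}

/*
 * Hypothetical CMA-side hook for the 'difficult' bits. When the caller
 * cannot sleep, the model only counts a deferral; a kernel version would
 * presumably queue work instead.
 */
static void model_cma_bookkeeping(unsigned long pfn, unsigned long nr_pages,
                                  bool can_sleep)
{
    (void)pfn;
    (void)nr_pages;
    if (can_sleep)
        cma_inline++;
    else
        cma_deferred++;
}

/*
 * Re-ordered free path: return the pages first and unconditionally, then
 * let CMA deal with its own state. Nothing on the hugetlb side waits.
 */
static void model_free_gigantic_page(unsigned long pfn, unsigned int order,
                                     bool from_cma, bool can_sleep)
{
    assert(order < 8 * sizeof(unsigned long));
    unsigned long nr_pages = 1UL << order;

    model_free_contig_range(pfn, nr_pages);
    if (from_cma)
        model_cma_bookkeeping(pfn, nr_pages, can_sleep);
}
```

Calling model_free_gigantic_page(0, 4, true, false) in this model frees 16 pages right away and records one deferred CMA job, so only the CMA bookkeeping is delayed.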