On 10/28/20 12:26 AM, Muchun Song wrote: > On Wed, Oct 28, 2020 at 8:33 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote: >> On 10/26/20 7:51 AM, Muchun Song wrote: >> >> I see the following routines follow the pattern for vmemmap manipulation >> in dax. > > Did you mean move those functions to mm/sparse-vmemmap.c? No. Sorry, that was mostly a not to myself. >>> +static void vmemmap_pgtable_deposit(struct page *page, pte_t *pte_p) >>> +{ >>> + pgtable_t pgtable = virt_to_page(pte_p); >>> + >>> + /* FIFO */ >>> + if (!page_huge_pte(page)) >>> + INIT_LIST_HEAD(&pgtable->lru); >>> + else >>> + list_add(&pgtable->lru, &page_huge_pte(page)->lru); >>> + page_huge_pte(page) = pgtable; >>> +} >>> + >>> +static pte_t *vmemmap_pgtable_withdraw(struct page *page) >>> +{ >>> + pgtable_t pgtable; >>> + >>> + /* FIFO */ >>> + pgtable = page_huge_pte(page); >>> + if (unlikely(!pgtable)) >>> + return NULL; >>> + page_huge_pte(page) = list_first_entry_or_null(&pgtable->lru, >>> + struct page, lru); >>> + if (page_huge_pte(page)) >>> + list_del(&pgtable->lru); >>> + return page_to_virt(pgtable); >>> +} >>> + ... >>> @@ -1783,6 +1892,14 @@ static struct page *alloc_fresh_huge_page(struct hstate *h, >>> if (!page) >>> return NULL; >>> >>> + if (vmemmap_pgtable_prealloc(h, page)) { >>> + if (hstate_is_gigantic(h)) >>> + free_gigantic_page(page, huge_page_order(h)); >>> + else >>> + put_page(page); >>> + return NULL; >>> + } >>> + >> >> It seems a bit strange that we will fail a huge page allocation if >> vmemmap_pgtable_prealloc fails. Not sure, but it almost seems like we shold >> allow the allocation and log a warning? It is somewhat unfortunate that >> we need to allocate a page to free pages. > > Yeah, it seems unfortunate. But if we allocate success, we can free some > vmemmap pages later. Like a compromise :) . If we can successfully allocate > a huge page, I also prefer to be able to successfully allocate another one page. > If we allow the allocation when vmemmap_pgtable_prealloc fails, we also > need to mark this page that vmemmap has not been released. Seems > increase complexity. Yes, I think it is better to leave code as it is and avoid complexity. -- Mike Kravetz