On 07/12/23 16:03, Muchun Song wrote:
> 
> 
> > On Jul 12, 2023, at 06:09, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
> > 
> > Freeing a hugetlb page and releasing base pages back to the underlying
> > allocator such as buddy or cma is performed in two steps:
> > - remove_hugetlb_folio() is called to remove the folio from hugetlb
> >   lists, get a ref on the page and remove the hugetlb destructor. This
> >   all must be done under the hugetlb lock. After this call, the page
> >   can be treated as a normal compound page or a collection of base
> >   size pages.
> > - update_and_free_hugetlb_folio() is called to allocate vmemmap if
> >   needed, and the free routine of the underlying allocator is called
> >   on the resulting page. We cannot hold the hugetlb lock here.
> > 
> > One issue with this scheme is that a memory error could occur between
> > these two steps. In this case, the memory error handling code treats
> > the old hugetlb page as a normal compound page or collection of base
> > pages. It will then try to SetPageHWPoison(page) on the page with an
> > error. If the page with error is a tail page without vmemmap, a write
> > error will occur when trying to set the flag.
> > 
> > Address this issue by modifying remove_hugetlb_folio() and
> > update_and_free_hugetlb_folio() such that the hugetlb destructor is not
> > cleared until after allocating vmemmap. Since clearing the destructor
> > requires holding the hugetlb lock, the clearing is done in
> > remove_hugetlb_folio() if the vmemmap is present. This saves a
> > lock/unlock cycle. Otherwise, the destructor is cleared in
> > update_and_free_hugetlb_folio() after allocating vmemmap.
> > 
> > Note that this will leave hugetlb pages in a state where they are marked
> > free (by a hugetlb specific page flag) and have a ref count. This is not
> > a normal state. The only code that would notice is the memory error
> > code, and it is set up to retry in such a case.
> > 
> > A subsequent patch will create a routine to do bulk processing of
> > vmemmap allocation. This will eliminate a lock/unlock cycle for each
> > hugetlb page in the case where we are freeing a large number of pages.
> > 
> > Fixes: ad2fa3717b74 ("mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page")
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> 
> Hi Mike,
> 
> I have seen an issue proposed by Jiaqi Yan in [1]. I didn't see any
> resolution for it. Am I missing something with this fix?
> 
> [1] https://lore.kernel.org/linux-mm/CACw3F53iPiLrJt4pyaX2aaZ5BVg9tj8x_k6-v7=9Xn1nrh=UCw@xxxxxxxxxxxxxx/
> 

My mistake! I sent the old version of the patch. The new version was
modified to simply check the destructor via folio_test_hugetlb() in
order to decide if it should be cleared.

I will send V2. Sorry!
-- 
Mike Kravetz
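
P.S. For anyone following along, a rough, untested sketch of the idea
described above for V2: only clear the destructor after vmemmap has been
allocated, and only if it was not already cleared in
remove_hugetlb_folio(). The helper name __clear_hugetlb_destructor() and
the exact placement are assumptions based on the description, not the
actual V2 patch.

static void __update_and_free_hugetlb_folio(struct hstate *h,
						struct folio *folio)
{
	...
	if (hugetlb_vmemmap_restore(h, &folio->page)) {
		/* vmemmap allocation failed; folio stays a hugetlb page */
		...
		return;
	}

	/*
	 * If remove_hugetlb_folio() did not already clear the destructor
	 * (vmemmap was not present at that point), clear it now that
	 * vmemmap has been allocated, under the hugetlb lock.
	 */
	if (folio_test_hugetlb(folio)) {
		spin_lock_irq(&hugetlb_lock);
		__clear_hugetlb_destructor(h, folio);	/* assumed helper */
		spin_unlock_irq(&hugetlb_lock);
	}
	...
}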