> On Sep 7, 2023, at 05:38, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote: > > On 09/06/23 15:38, Muchun Song wrote: >> >> >> On 2023/9/6 05:44, Mike Kravetz wrote: >>> Now that batching of hugetlb vmemmap optimization processing is possible, >>> batch the freeing of vmemmap pages. When freeing vmemmap pages for a >>> hugetlb page, we add them to a list that is freed after the entire batch >>> has been processed. >>> >>> This enhances the ability to return contiguous ranges of memory to the >>> low level allocators. >>> >>> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> >>> --- >>> mm/hugetlb_vmemmap.c | 60 ++++++++++++++++++++++++++++---------------- >>> 1 file changed, 38 insertions(+), 22 deletions(-) >>> >>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c >>> index 79de984919ef..a715712df831 100644 >>> --- a/mm/hugetlb_vmemmap.c >>> +++ b/mm/hugetlb_vmemmap.c >>> @@ -306,18 +306,21 @@ static void vmemmap_restore_pte(pte_t *pte, unsigned long addr, >>> * @end: end address of the vmemmap virtual address range that we want to >>> * remap. >>> * @reuse: reuse address. >>> + * @vmemmap_pages: list to deposit vmemmap pages to be freed. It is callers >>> + * responsibility to free pages. >>> * >>> * Return: %0 on success, negative error code otherwise. >>> */ >>> static int vmemmap_remap_free(unsigned long start, unsigned long end, >>> - unsigned long reuse) >>> + unsigned long reuse, >>> + struct list_head *vmemmap_pages) >>> { >>> int ret; >>> - LIST_HEAD(vmemmap_pages); >>> + LIST_HEAD(freed_pages); >> >> IIUC, we could reuse the parameter of @vmemmap_pages directly instead of >> a temporary variable, could it be dropped? >> > > I was concerned about the error case where we call vmemmap_remap_range a > second time. In the first call to vmemmap_remap_range with vmemmap_remap_pte, > vmemmap pages to be freed are added to the end of the list (list_add_tail). > In the call to vmemmap_remap_range after error with vmemmap_restore_pte, > pages are taken off the head of the list (list_first_entry). So, it seems > that it would be possible to use a different set of pages in the restore Yes. > operation. This would be an issue if pages had different characteristics such > as being on different nodes. Is that a real concern? A good point. Now I see your concern, it is better to keep the same node as before when error occurs. > > I suppose we could change vmemmap_remap_pte to add pages to the head of > the list? I do not recall the reasoning behind adding to tail. I think we could do this, the code will be a little simple. Actually, there is no reason behind adding to tail (BTW, the first commit is introduced by me, no secret here :-)). Thanks. > -- > Mike Kravetz