On 09/06/23 15:38, Muchun Song wrote: > > > On 2023/9/6 05:44, Mike Kravetz wrote: > > Now that batching of hugetlb vmemmap optimization processing is possible, > > batch the freeing of vmemmap pages. When freeing vmemmap pages for a > > hugetlb page, we add them to a list that is freed after the entire batch > > has been processed. > > > > This enhances the ability to return contiguous ranges of memory to the > > low level allocators. > > > > Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> > > --- > > mm/hugetlb_vmemmap.c | 60 ++++++++++++++++++++++++++++---------------- > > 1 file changed, 38 insertions(+), 22 deletions(-) > > > > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c > > index 79de984919ef..a715712df831 100644 > > --- a/mm/hugetlb_vmemmap.c > > +++ b/mm/hugetlb_vmemmap.c > > @@ -306,18 +306,21 @@ static void vmemmap_restore_pte(pte_t *pte, unsigned long addr, > > * @end: end address of the vmemmap virtual address range that we want to > > * remap. > > * @reuse: reuse address. > > + * @vmemmap_pages: list to deposit vmemmap pages to be freed. It is callers > > + * responsibility to free pages. > > * > > * Return: %0 on success, negative error code otherwise. > > */ > > static int vmemmap_remap_free(unsigned long start, unsigned long end, > > - unsigned long reuse) > > + unsigned long reuse, > > + struct list_head *vmemmap_pages) > > { > > int ret; > > - LIST_HEAD(vmemmap_pages); > > + LIST_HEAD(freed_pages); > > IIUC, we could reuse the parameter of @vmemmap_pages directly instead of > a temporary variable, could it be dropped? > I was concerned about the error case where we call vmemmap_remap_range a second time. In the first call to vmemmap_remap_range with vmemmap_remap_pte, vmemmap pages to be freed are added to the end of the list (list_add_tail). In the call to vmemmap_remap_range after error with vmemmap_restore_pte, pages are taken off the head of the list (list_first_entry). So, it seems that it would be possible to use a different set of pages in the restore operation. This would be an issue if pages had different characteristics such as being on different nodes. Is that a real concern? I suppose we could change vmemmap_remap_pte to add pages to the head of the list? I do not recall the reasoning behind adding to tail. -- Mike Kravetz