On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> field in the hstate to indicate how many vmemmap pages associated with
> a HugeTLB page that we can free to buddy allocator. And initialize it

"can be freed to buddy allocator"

> in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> feature.
>
> Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> Acked-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

With the nits below addressed, you can add:

Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>

>  static int __init early_hugetlb_free_vmemmap_param(char *buf)
>  {
> +	/* We cannot optimize if a "struct page" crosses page boundaries. */
> +	if (!is_power_of_2(sizeof(struct page)))
> +		return 0;
> +

I wonder if we should report a warning when someone tries to enable this
feature and the struct page size is not a power of 2. Otherwise they may
wonder why it does not work for them.

> +void __init hugetlb_vmemmap_init(struct hstate *h)
> +{
> +	unsigned int nr_pages = pages_per_huge_page(h);
> +	unsigned int vmemmap_pages;
> +
> +	if (!hugetlb_free_vmemmap_enabled)
> +		return;
> +
> +	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> +	/*
> +	 * The head page and the first tail page are not to be freed to buddy
> +	 * system, the others page will map to the first tail page. So there
> +	 * are the remaining pages that can be freed.

"the other pages will map to the first tail page, so they can be freed."

> +	 *
> +	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> +	 * on some architectures (e.g. aarch64). See Documentation/arm64/
> +	 * hugetlbpage.rst for more details.
> +	 */
> +	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> +		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> +
> +	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> +		h->name);

Maybe specify that this is hugetlb code:

	pr_info("%s: blabla", __func__, ...)

or

	pr_info("hugetlb: blabla", ...);

although I am not sure whether we need that at all, or maybe just use
pr_debug().

-- 
Oscar Salvador
SUSE L3
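For anyone following the arithmetic in hugetlb_vmemmap_init(), here is a
minimal user-space sketch of the same calculation. It is not the kernel
code: nr_free_vmemmap_pages() is a hypothetical helper, is_power_of_2() is a
local stand-in for the kernel macro, and RESERVE_VMEMMAP_NR is assumed to be
2 (the head page plus the first tail page, as the quoted comment describes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value: the head page and the first tail page stay mapped. */
#define RESERVE_VMEMMAP_NR 2u

/* User-space stand-in for the kernel's is_power_of_2(). */
static bool is_power_of_2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper mirroring the arithmetic in hugetlb_vmemmap_init().
 * Returns how many vmemmap pages could be freed for one huge page, or 0
 * when the optimization does not apply.
 */
static unsigned int nr_free_vmemmap_pages(unsigned long nr_pages,
					  size_t struct_page_size,
					  unsigned int page_shift)
{
	unsigned long vmemmap_pages;

	assert(page_shift < 8 * sizeof(unsigned long));

	/* A "struct page" must not cross page boundaries. */
	if (!is_power_of_2(struct_page_size))
		return 0;

	vmemmap_pages = (nr_pages * struct_page_size) >> page_shift;

	/* On some configurations there is nothing beyond the reserved pages. */
	if (vmemmap_pages > RESERVE_VMEMMAP_NR)
		return (unsigned int)(vmemmap_pages - RESERVE_VMEMMAP_NR);

	return 0;
}
```

Assuming a 64-byte struct page and 4 KiB base pages, nr_free_vmemmap_pages(512, 64, 12)
gives 6 for a 2 MiB huge page and nr_free_vmemmap_pages(262144, 64, 12) gives
4094 for a 1 GiB one. With 64 KiB base pages and 32 base pages per huge page
(the aarch64-style case the comment mentions), the result is 0, and a
non-power-of-2 size such as 56 also yields 0, which is the silent case where
a warning might help.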