On 5/4/22 19:35, Muchun Song wrote:
> On Wed, May 04, 2022 at 03:12:39PM -0700, Mike Kravetz wrote:
>> On 4/29/22 05:18, Muchun Song wrote:
>>> +static void vmemmap_optimize_mode_switch(enum vmemmap_optimize_mode to)
>>> +{
>>> +	if (vmemmap_optimize_mode == to)
>>> +		return;
>>> +
>>> +	if (to == VMEMMAP_OPTIMIZE_OFF)
>>> +		static_branch_dec(&hugetlb_optimize_vmemmap_key);
>>> +	else
>>> +		static_branch_inc(&hugetlb_optimize_vmemmap_key);
>>> +	vmemmap_optimize_mode = to;
>>> +}
>>> +
>>>  static int __init hugetlb_vmemmap_early_param(char *buf)
>>>  {
>>>  	bool enable;
>>> +	enum vmemmap_optimize_mode mode;
>>>
>>>  	if (kstrtobool(buf, &enable))
>>>  		return -EINVAL;
>>>
>>> -	if (enable)
>>> -		static_branch_enable(&hugetlb_optimize_vmemmap_key);
>>> -	else
>>> -		static_branch_disable(&hugetlb_optimize_vmemmap_key);
>>> +	mode = enable ? VMEMMAP_OPTIMIZE_ON : VMEMMAP_OPTIMIZE_OFF;
>>> +	vmemmap_optimize_mode_switch(mode);
>>>
>>>  	return 0;
>>>  }
>>> @@ -60,6 +80,8 @@ int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
>>>  	vmemmap_end = vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
>>>  	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
>>>
>>> +	VM_BUG_ON_PAGE(!vmemmap_pages, head);
>>> +
>>>  	/*
>>>  	 * The pages which the vmemmap virtual address range [@vmemmap_addr,
>>>  	 * @vmemmap_end) are mapped to are freed to the buddy allocator, and
>>> @@ -69,8 +91,10 @@ int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
>>>  	 */
>>>  	ret = vmemmap_remap_alloc(vmemmap_addr, vmemmap_end, vmemmap_reuse,
>>>  				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
>>> -	if (!ret)
>>> +	if (!ret) {
>>>  		ClearHPageVmemmapOptimized(head);
>>> +		static_branch_dec(&hugetlb_optimize_vmemmap_key);
>>> +	}
>>>
>>>  	return ret;
>>>  }
>>> @@ -84,6 +108,8 @@ void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
>>>  	if (!vmemmap_pages)
>>>  		return;
>>>
>>> +	static_branch_inc(&hugetlb_optimize_vmemmap_key);
>>
>> Can you explain the reasoning behind doing the static_branch_inc here in
>> free, and the static_branch_dec in alloc?
>> IIUC, they may not be absolutely necessary, but you could use the count to
>> know how many optimized pages are in use? Or, I may just be missing
>> something.
>>
>
> Partly right. One 'count' is not enough. I implemented this with a similar
> approach in v6 [1]. Besides the 'count', we also need a lock for
> synchronization. However, both the count and the synchronization are
> already included in the static_branch_inc/dec infrastructure, so it is
> simpler to use static_branch_inc/dec directly, right?
>
> [1] https://lore.kernel.org/all/20220330153745.20465-5-songmuchun@xxxxxxxxxxxxx/
>

Sorry, but I am a little confused.

vmemmap_optimize_mode_switch() will static_branch_inc() to enable and
static_branch_dec() to disable. In addition, each time we optimize
(allocate) a hugetlb page after enabling, we will static_branch_inc().

Suppose we have 1 hugetlb page optimized. So, the static count == 2, IIUC.

Then someone turns off optimization via sysctl: static count == 1?

If we then add another hugetlb page via nr_hugepages, it seems that it would
still be optimized, since static count == 1 keeps the branch enabled. Is
that correct? Do we need to free all optimized hugetlb pages before we can
add new pages without optimization? (A minimal sketch of this counting is
included below.)
-- 
Mike Kravetz
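
To make the counting above concrete, here is a minimal user-space sketch
(hypothetical, not kernel code): the static key is modeled as a bare
integer, optimize_enabled() stands in for the kernel's static-branch test,
and the increments and decrements mirror vmemmap_optimize_mode_switch(),
hugetlb_vmemmap_free(), and hugetlb_vmemmap_alloc().

#include <stdbool.h>
#include <stdio.h>

static int key_count;	/* models hugetlb_optimize_vmemmap_key */

/* Models the static-branch check: "enabled" while the count > 0. */
static bool optimize_enabled(void)
{
	return key_count > 0;
}

int main(void)
{
	key_count++;	/* sysctl ON: vmemmap_optimize_mode_switch() */
	key_count++;	/* one page optimized: hugetlb_vmemmap_free() */
	key_count--;	/* sysctl OFF: count drops from 2 to 1 */

	/*
	 * The already-optimized page keeps the count above zero, so a
	 * page added via nr_hugepages would still see the branch enabled
	 * and be optimized.
	 */
	printf("new pages optimized? %s\n",
	       optimize_enabled() ? "yes" : "no");	/* prints "yes" */
	return 0;
}

Under this model, the answer to the question above is "yes": the mode switch
contributes one reference and each optimized page contributes another, so
the branch stays enabled until every optimized page is freed.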