On Sat, Feb 12, 2022 at 6:08 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>
> On Fri, Feb 11, 2022 at 8:37 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
> >
> > On 2/11/22 07:54, Muchun Song wrote:
> > > On Fri, Feb 11, 2022 at 3:34 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
> > > [...]
> > >> pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
> > >> -                                      struct vmem_altmap *altmap)
> > >> +                                      struct vmem_altmap *altmap,
> > >> +                                      struct page *block)
> > >
> > > Why not use the name "reuse" instead of "block"?
> > > "reuse" seems clearer.
> > >
> > Good idea, let me rename that to @reuse.
> >
> > >>  {
> > >>         pte_t *pte = pte_offset_kernel(pmd, addr);
> > >>         if (pte_none(*pte)) {
> > >>                 pte_t entry;
> > >>                 void *p;
> > >>
> > >> -               p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
> > >> -               if (!p)
> > >> -                       return NULL;
> > >> +               if (!block) {
> > >> +                       p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
> > >> +                       if (!p)
> > >> +                               return NULL;
> > >> +               } else {
> > >> +                       /*
> > >> +                        * When a PTE/PMD entry is freed from the init_mm
> > >> +                        * there's a free_pages() call to this page allocated
> > >> +                        * above. Thus this get_page() is paired with the
> > >> +                        * put_page_testzero() on the freeing path.
> > >> +                        * This can only be called by certain ZONE_DEVICE
> > >> +                        * paths, and through vmemmap_populate_compound_pages()
> > >> +                        * when slab is available.
> > >> +                        */
> > >> +                       get_page(block);
> > >> +                       p = page_to_virt(block);
> > >> +               }
> > >>                 entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
> > >>                 set_pte_at(&init_mm, addr, pte, entry);
> > >>         }
> > >> @@ -609,7 +624,8 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node)
> > >>  }
> > >>
> > >>  static int __meminit vmemmap_populate_address(unsigned long addr, int node,
> > >> -                                             struct vmem_altmap *altmap)
> > >> +                                             struct vmem_altmap *altmap,
> > >> +                                             struct page *reuse, struct page **page)
> > >
> > > We can remove the last argument (struct page **page) if we change
> > > the return type to "pte_t *". Simpler, don't you think?
> > >
> > Hmmm, perhaps it is simpler, especially given the only error code is ENOMEM.
> >
> > Albeit perhaps what we want is a `struct page *` rather than a pte.
>
> The caller can extract `struct page` from a pte.
>
> > [...]
> > >> -       if (vmemmap_populate(start, end, nid, altmap))
> > >> +       if (pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
> > >
> > > Should we add a judgment like "is_power_of_2(sizeof(struct page))", since
> > > this optimization only applies when the size of struct page does not
> > > cross page boundaries?
> >
> > Totally missed that -- let me make that adjustment.
> >
> > Can I ask under which architectures/conditions this happens?
>
> E.g. arm64 when !CONFIG_MEMCG. Plus !CONFIG_SLUB even on x86_64.
>
> Thanks.