On 2022/6/12 23:44, Muchun Song wrote:
> On Sat, Jun 11, 2022 at 10:13:52AM +0800, Miaohe Lin wrote:
>> Since commit 5232c63f46fd ("mm: Make compound_pincount always available"),
>> compound_pincount_ptr is stored at first tail page now. So we should call
>> prep_compound_head() after the first tail page is initialized to take
>> advantage of the likelihood of that tail struct page being cached given
>> that we will read them right after in prep_compound_head().
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
>> Cc: Joao Martins <joao.m.martins@xxxxxxxxxx>
>> ---
>> v2:
>>   Don't move prep_compound_head() outside loop per Joao.
>> ---
>>  mm/page_alloc.c | 17 +++++++++++------
>>  1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 4c7d99ee58b4..048df5d78add 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6771,13 +6771,18 @@ static void __ref memmap_init_compound(struct page *head,
>>  		set_page_count(page, 0);
>>  
>>  		/*
>> -		 * The first tail page stores compound_mapcount_ptr() and
>> -		 * compound_order() and the second tail page stores
>> -		 * compound_pincount_ptr(). Call prep_compound_head() after
>> -		 * the first and second tail pages have been initialized to
>> -		 * not have the data overwritten.
>> +		 * The first tail page stores compound_mapcount_ptr(),
>> +		 * compound_order() and compound_pincount_ptr(). Call
>> +		 * prep_compound_head() after the first tail page have
>> +		 * been initialized to not have the data overwritten.
>> +		 *
>> +		 * Note the idea to make this right after we initialize
>> +		 * the offending tail pages is trying to take advantage
>> +		 * of the likelihood of those tail struct pages being
>> +		 * cached given that we will read them right after in
>> +		 * prep_compound_head().
>>  		 */
>> -		if (pfn == head_pfn + 2)
>> +		if (unlikely(pfn == head_pfn + 1))
>>  			prep_compound_head(head, order);
>
> For me it is weird not to put this out of the loop. I saw the reason
> is because of the caching suggested by Joao. But I think this is not
> a hot path and putting it out of the loop may be more intuitive at least
> for me. Maybe this optimization is unnecessary (maybe I am wrong).
> And it will be consistent with prep_compound_page() (at least it does
> not do the similar optimization) if we drop this optimization.

This is also what I thought in the first version. :)

>
> Hi Joao,
>
> I am wondering is it a significant optimization for zone device memory?
> I found this code existed from the 1st version you introduced. So
> I suspect maybe you have some numbers, would you like to share with us?

Those numbers would be really helpful.

>
> Thanks.

Thanks!

>
> .
>
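
For reference, the two placements being compared can be sketched in plain
userspace C as below. This is only an illustration under simplifying
assumptions: struct fake_page and every fake_* / init_compound_* helper are
hypothetical stand-ins invented for the sketch, not the kernel's struct page
or the real prep_compound_head(). It shows only that both orderings leave the
first tail page with the same contents, as long as the head preparation runs
after that tail page has been initialized. It says nothing about the cache
behaviour, which is the open question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; NOT the kernel layout. */
struct fake_page {
	int initialized;
	unsigned int order;	/* only meaningful in the first tail page */
	int mapcount;		/* only meaningful in the first tail page */
	int pincount;		/* only meaningful in the first tail page */
};

/* Stand-in for per-tail-page init: resets the whole struct. */
static void fake_init_tail(struct fake_page *page)
{
	memset(page, 0, sizeof(*page));
	page->initialized = 1;
}

/* Stand-in for prep_compound_head(): writes into the first tail page. */
static void fake_prep_compound_head(struct fake_page *head, unsigned int order)
{
	head[1].order = order;
	head[1].mapcount = -1;
	head[1].pincount = 0;
}

/* Placement as in the patch: right after the first tail page is set up. */
static void init_compound_in_loop(struct fake_page *head, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	for (size_t i = 1; i < nr; i++) {
		fake_init_tail(&head[i]);
		if (i == 1)
			fake_prep_compound_head(head, order);
	}
}

/* Placement suggested in review: once, after the loop has finished. */
static void init_compound_after_loop(struct fake_page *head, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	for (size_t i = 1; i < nr; i++)
		fake_init_tail(&head[i]);
	fake_prep_compound_head(head, order);
}
```

In this sketch the metadata would only be lost if the head preparation ran
before fake_init_tail() touched head[1], so the choice between the two
variants comes down to readability versus the possible cache benefit.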