Re: [PATCH v2] mm/page_alloc: minor clean up for memmap_init_compound()

On Sat, Jun 11, 2022 at 10:13:52AM +0800, Miaohe Lin wrote:
> Since commit 5232c63f46fd ("mm: Make compound_pincount always available"),
> compound_pincount_ptr is stored at first tail page now. So we should call
> prep_compound_head() after the first tail page is initialized to take
> advantage of the likelihood of that tail struct page being cached given
> that we will read them right after in prep_compound_head().
> 
> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> Cc: Joao Martins <joao.m.martins@xxxxxxxxxx>
> ---
> v2:
>   Don't move prep_compound_head() outside loop per Joao.
> ---
>  mm/page_alloc.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4c7d99ee58b4..048df5d78add 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6771,13 +6771,18 @@ static void __ref memmap_init_compound(struct page *head,
>  		set_page_count(page, 0);
>  
>  		/*
> -		 * The first tail page stores compound_mapcount_ptr() and
> -		 * compound_order() and the second tail page stores
> -		 * compound_pincount_ptr(). Call prep_compound_head() after
> -		 * the first and second tail pages have been initialized to
> -		 * not have the data overwritten.
> +		 * The first tail page stores compound_mapcount_ptr(),
> +		 * compound_order() and compound_pincount_ptr(). Call
> +		 * prep_compound_head() after the first tail page have
> +		 * been initialized to not have the data overwritten.
> +		 *
> +		 * Note the idea to make this right after we initialize
> +		 * the offending tail pages is trying to take advantage
> +		 * of the likelihood of those tail struct pages being
> +		 * cached given that we will read them right after in
> +		 * prep_compound_head().
>  		 */
> -		if (pfn == head_pfn + 2)
> +		if (unlikely(pfn == head_pfn + 1))
>  			prep_compound_head(head, order);

To me it looks odd not to move this out of the loop. I understand it
stays inside because of the caching benefit Joao suggested. But I don't
think this is a hot path, and calling it after the loop would be more
intuitive, at least to me. The optimization may be unnecessary (I could
be wrong). Dropping it would also be consistent with
prep_compound_page(), which, as far as I can see, does not do a similar
optimization.
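
To make the two placements concrete, here is a rough userspace sketch.
It is not kernel code: struct fake_page, init_tail(), prep_head(),
init_in_loop(), init_after_loop() and compound_ok() are all made-up
stand-ins, and the memset() only models the assumption that tail init
wipes whatever was stored in the page before. Under those assumptions
both variants end in the same state, and the only hard constraint is
that the head prep runs after the first tail page is initialized.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct page; NOT the kernel layout. */
struct fake_page {
	uintptr_t compound_head;	/* tail pages: head address | 1 */
	unsigned int order;		/* only meaningful in the first tail */
	int pincount;			/* only meaningful in the first tail */
	int initialized;
};

/* Stand-in for the per-tail init in the loop: it wipes the page first. */
static void init_tail(struct fake_page *head, struct fake_page *tail)
{
	memset(tail, 0, sizeof(*tail));
	tail->compound_head = (uintptr_t)head | 1;
	tail->initialized = 1;
}

/* Stand-in for prep_compound_head(): stores metadata in the first tail. */
static void prep_head(struct fake_page *head, unsigned int order)
{
	/* Must run after head[1] is initialized, or init_tail() wipes it. */
	assert(head[1].initialized);
	head[1].order = order;
	head[1].pincount = 0;
}

/* Variant A: inside the loop, right after the first tail page (as in v2). */
static void init_in_loop(struct fake_page *head, unsigned int order)
{
	unsigned long i, nr = 1UL << order;	/* assumes order >= 1 */

	for (i = 1; i < nr; i++) {
		init_tail(head, &head[i]);
		if (i == 1)
			prep_head(head, order);
	}
}

/* Variant B: once, after the loop has initialized every tail page. */
static void init_after_loop(struct fake_page *head, unsigned int order)
{
	unsigned long i, nr = 1UL << order;	/* assumes order >= 1 */

	for (i = 1; i < nr; i++)
		init_tail(head, &head[i]);
	prep_head(head, order);
}

/* Returns 1 if every tail is initialized and head[1] holds the order. */
static int compound_ok(const struct fake_page *head, unsigned int order)
{
	unsigned long i, nr = 1UL << order;

	for (i = 1; i < nr; i++)
		if (!head[i].initialized ||
		    head[i].compound_head != ((uintptr_t)head | 1))
			return 0;
	return head[1].order == order && head[1].pincount == 0;
}
```

The sketch says nothing about cache behaviour, which is the only
difference between the two variants that matters here.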

Hi Joao,

Is this a significant optimization for zone device memory? This code
has been there since the first version you posted, so I suspect you may
have some numbers. Would you mind sharing them with us?

Thanks.




