Re: + mm-wrap-__find_buddy_pfn-with-a-necessary-buddy-page-validation.patch added to -mm tree

> From: Zi Yan <ziy@xxxxxxxxxx>
> Subject: mm: wrap __find_buddy_pfn() with a necessary buddy page validation
> 
> Whenever the buddy of a page is found from __find_buddy_pfn(),
> page_is_buddy() should be used to check its validity. Add a helper
> function find_buddy_page_pfn() to find the buddy page and do the check
> together.
> 
> Link: https://lkml.kernel.org/r/20220401230804.1658207-2-zi.yan@xxxxxxxx
> Link: https://lore.kernel.org/linux-mm/CAHk-=wji_AmYygZMTsPMdJ7XksMt7kOur8oDfDdniBRMjm4VkQ@xxxxxxxxxxxxxx/
> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxx>
> Cc: Oscar Salvador <osalvador@xxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

Sorry for replying here; I hope Mimecast stops messing up my mailbox
soon. Only some nits below.

In general, LGTM.

Acked-by: David Hildenbrand <david@xxxxxxxxxx>



> ---
> 
>  mm/internal.h       |  117 ++++++++++++++++++++++++++++++++----------
>  mm/page_alloc.c     |   52 ++----------------
>  mm/page_isolation.c |    9 +--
>  3 files changed, 101 insertions(+), 77 deletions(-)
> 
> --- a/mm/internal.h~mm-wrap-__find_buddy_pfn-with-a-necessary-buddy-page-validation
> +++ a/mm/internal.h
> @@ -212,6 +212,67 @@ struct alloc_context {
>  };
>  
>  /*
> + * This function returns the order of a free page in the buddy system. In
> + * general, page_zone(page)->lock must be held by the caller to prevent the
> + * page from being allocated in parallel and returning garbage as the order.
> + * If a caller does not hold page_zone(page)->lock, it must guarantee that the
> + * page cannot be allocated or merged in parallel. Alternatively, it must
> + * handle invalid values gracefully, and use buddy_order_unsafe() below.
> + */
> +static inline unsigned int buddy_order(struct page *page)
> +{
> +	/* PageBuddy() must be checked by the caller */
> +	return page_private(page);
> +}
> +
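
Maybe worth spelling out the intended usage here. Just a sketch of the
pattern the comment above describes, not code from this patch ("zone",
"page", "flags" and "order" are hypothetical locals of some caller):

        unsigned long flags;
        unsigned int order = 0;

        spin_lock_irqsave(&zone->lock, flags);
        if (PageBuddy(page))
                order = buddy_order(page);      /* stable under zone->lock */
        spin_unlock_irqrestore(&zone->lock, flags);
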
> +/*
> + * Like buddy_order(), but for callers who cannot afford to hold the zone lock.
> + * PageBuddy() should be checked first by the caller to minimize race window,
> + * and invalid values must be handled gracefully.
> + *
> + * READ_ONCE is used so that if the caller assigns the result into a local
> + * variable and e.g. tests it for valid range before using, the compiler cannot
> + * decide to remove the variable and inline the page_private(page) multiple
> + * times, potentially observing different values in the tests and the actual
> + * use of the result.
> + */
> +#define buddy_order_unsafe(page)	READ_ONCE(page_private(page))
> +
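
And the lockless counterpart, again only a sketch of the pattern the
comment describes (reading the order once into a local and range-checking
it before use; the skip-ahead use of "pfn" here is made up for
illustration):

        unsigned int order;

        if (PageBuddy(page)) {
                order = buddy_order_unsafe(page);
                /* might be garbage if the page was just allocated or split */
                if (order < MAX_ORDER)
                        pfn += (1UL << order) - 1;
        }
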
> +/*
> + * This function checks whether a page is free && is the buddy.
> + * We can coalesce a page and its buddy if
> + * (a) the buddy is not in a hole (check before calling!) &&
> + * (b) the buddy is in the buddy system &&
> + * (c) a page and its buddy have the same order &&
> + * (d) a page and its buddy are in the same zone.
> + *
> + * For recording whether a page is in the buddy system, we set PageBuddy.
> + * Setting, clearing, and testing PageBuddy is serialized by zone->lock.
> + *
> + * For recording page's order, we use page_private(page).
> + */
> +static inline bool page_is_buddy(struct page *page, struct page *buddy,
> +							unsigned int order)

Nit: Strange indentation of the last parameter. But it's already in the
code you're moving, so ...

> +{
> +	if (!page_is_guard(buddy) && !PageBuddy(buddy))
> +		return false;
> +
> +	if (buddy_order(buddy) != order)
> +		return false;
> +
> +	/*
> +	 * zone check is done late to avoid uselessly calculating
> +	 * zone/node ids for pages that could never merge.

I know, this is from existing code. But there really isn't a node check
when relying on the zone id. I hope we are not missing a node check :)

> +	 */
> +	if (page_zone_id(page) != page_zone_id(buddy))
> +		return false;
> +
> +	VM_BUG_ON_PAGE(page_count(buddy) != 0, buddy);
> +
> +	return true;
> +}
> +
> +/*
>   * Locate the struct page for both the matching buddy in our
>   * pair (buddy1) and the combined O(n+1) page they form (page).
>   *
> @@ -234,6 +295,35 @@ __find_buddy_pfn(unsigned long page_pfn,
>  	return page_pfn ^ (1 << order);
>  }
>  
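
(For readers: the XOR just flips the single bit at position @order, so
with order == 3 a pfn of 0x1000 maps to buddy pfn 0x1008 and vice versa;
together the pair forms the order-4 block starting at pfn 0x1000.)
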
> +/*
> + * Find the buddy of @page and validate it.
> + * @page: The input page
> + * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
> + *       function is used in the performance-critical __free_one_page().
> + * @order: The order of the page
> + * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
> + *             page_to_pfn().
> + *
> + * The found buddy can be a non-PageBuddy page, be outside of @page's zone, or
> + * have a different order than @page. Validation is necessary before using it.
> + *
> + * Return: the found buddy page or NULL if not found.
> + */
> +static inline struct page *find_buddy_page_pfn(struct page *page,
> +			unsigned long pfn, unsigned int order, unsigned long *buddy_pfn)
> +{
> +	struct page *buddy;
> +	unsigned long __buddy_pfn = __find_buddy_pfn(pfn, order);

Nit: reverse christmas tree
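That is, longest declaration line first:

        unsigned long __buddy_pfn = __find_buddy_pfn(pfn, order);
        struct page *buddy;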

> +
> +	buddy = page + (__buddy_pfn - pfn);
> +	if (buddy_pfn)
> +		*buddy_pfn = __buddy_pfn;
> +
> +	if (page_is_buddy(page, buddy, order))
> +		return buddy;
> +	return NULL;
> +}
> +
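The conversion in __free_one_page() then boils down to something like
the following (paraphrasing the hunk that is not quoted here, so take
the exact shape with a grain of salt):

        unsigned long buddy_pfn;
        struct page *buddy;

        buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
        if (!buddy)
                goto done_merging;
        /* buddy is PageBuddy, same order and same zone: safe to merge */
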

[...]
>  #ifdef CONFIG_COMPACTION
>  static inline struct capture_control *task_capc(struct zone *zone)
>  {
> @@ -1010,18 +976,17 @@ static inline bool
>  buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
>  		   struct page *page, unsigned int order)
>  {
> -	struct page *higher_page, *higher_buddy;
> -	unsigned long combined_pfn;
> +	struct page *higher_page;
> +	unsigned long higher_page_pfn;

Nit: reverse christmas tree



-- 
Thanks,

David / dhildenb



