> From: Zi Yan <ziy@xxxxxxxxxx>
> Subject: mm: wrap __find_buddy_pfn() with a necessary buddy page validation
>
> Whenever the buddy of a page is found from __find_buddy_pfn(),
> page_is_buddy() should be used to check its validity.  Add a helper
> function find_buddy_page_pfn() to find the buddy page and do the check
> together.
>
> Link: https://lkml.kernel.org/r/20220401230804.1658207-2-zi.yan@xxxxxxxx
> Link: https://lore.kernel.org/linux-mm/CAHk-=wji_AmYygZMTsPMdJ7XksMt7kOur8oDfDdniBRMjm4VkQ@xxxxxxxxxxxxxx/
> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxx>
> Cc: Oscar Salvador <osalvador@xxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

Sorry for replying here, I hope Mimecast stops messing up my mailbox soon.

Only some nits. In general, LGTM

Acked-by: David Hildenbrand <david@xxxxxxxxxx>

> ---
>
>  mm/internal.h       |  117 ++++++++++++++++++++++++++++++++----------
>  mm/page_alloc.c     |   52 ++----------------
>  mm/page_isolation.c |    9 +--
>  3 files changed, 101 insertions(+), 77 deletions(-)
>
> --- a/mm/internal.h~mm-wrap-__find_buddy_pfn-with-a-necessary-buddy-page-validation
> +++ a/mm/internal.h
> @@ -212,6 +212,67 @@ struct alloc_context {
>  };
>
>  /*
> + * This function returns the order of a free page in the buddy system. In
> + * general, page_zone(page)->lock must be held by the caller to prevent the
> + * page from being allocated in parallel and returning garbage as the order.
> + * If a caller does not hold page_zone(page)->lock, it must guarantee that the
> + * page cannot be allocated or merged in parallel. Alternatively, it must
> + * handle invalid values gracefully, and use buddy_order_unsafe() below.
> + */
> +static inline unsigned int buddy_order(struct page *page)
> +{
> +	/* PageBuddy() must be checked by the caller */
> +	return page_private(page);
> +}
> +
> +/*
> + * Like buddy_order(), but for callers who cannot afford to hold the zone lock.
> + * PageBuddy() should be checked first by the caller to minimize race window,
> + * and invalid values must be handled gracefully.
> + *
> + * READ_ONCE is used so that if the caller assigns the result into a local
> + * variable and e.g. tests it for valid range before using, the compiler cannot
> + * decide to remove the variable and inline the page_private(page) multiple
> + * times, potentially observing different values in the tests and the actual
> + * use of the result.
> + */
> +#define buddy_order_unsafe(page)	READ_ONCE(page_private(page))
> +
> +/*
> + * This function checks whether a page is free && is the buddy
> + * we can coalesce a page and its buddy if
> + * (a) the buddy is not in a hole (check before calling!) &&
> + * (b) the buddy is in the buddy system &&
> + * (c) a page and its buddy have the same order &&
> + * (d) a page and its buddy are in the same zone.
> + *
> + * For recording whether a page is in the buddy system, we set PageBuddy.
> + * Setting, clearing, and testing PageBuddy is serialized by zone->lock.
> + *
> + * For recording page's order, we use page_private(page).
> + */
> +static inline bool page_is_buddy(struct page *page, struct page *buddy,
> +				 unsigned int order)

Nit: Strange indentation of the last parameter. But it's already in the
code you're moving, so ...

> +{
> +	if (!page_is_guard(buddy) && !PageBuddy(buddy))
> +		return false;
> +
> +	if (buddy_order(buddy) != order)
> +		return false;
> +
> +	/*
> +	 * zone check is done late to avoid uselessly calculating
> +	 * zone/node ids for pages that could never merge.

I know, this is from existing code. But there really isn't a node check
when relying on the zone id.
I hope we are not missing a node check :)

> +	 */
> +	if (page_zone_id(page) != page_zone_id(buddy))
> +		return false;
> +
> +	VM_BUG_ON_PAGE(page_count(buddy) != 0, buddy);
> +
> +	return true;
> +}
> +
> +/*
>   * Locate the struct page for both the matching buddy in our
>   * pair (buddy1) and the combined O(n+1) page they form (page).
>   *
> @@ -234,6 +295,35 @@ __find_buddy_pfn(unsigned long page_pfn,
>  	return page_pfn ^ (1 << order);
>  }
>
> +/*
> + * Find the buddy of @page and validate it.
> + * @page: The input page
> + * @pfn: The pfn of the page, it saves a call to page_to_pfn() when the
> + *       function is used in the performance-critical __free_one_page().
> + * @order: The order of the page
> + * @buddy_pfn: The output pointer to the buddy pfn, it also saves a call to
> + *             page_to_pfn().
> + *
> + * The found buddy can be a non PageBuddy, out of @page's zone, or its order is
> + * not the same as @page. The validation is necessary before use it.
> + *
> + * Return: the found buddy page or NULL if not found.
> + */
> +static inline struct page *find_buddy_page_pfn(struct page *page,
> +			unsigned long pfn, unsigned int order, unsigned long *buddy_pfn)
> +{
> +	struct page *buddy;
> +	unsigned long __buddy_pfn = __find_buddy_pfn(pfn, order);

Nit: reverse christmas tree

> +
> +	buddy = page + (__buddy_pfn - pfn);
> +	if (buddy_pfn)
> +		*buddy_pfn = __buddy_pfn;
> +
> +	if (page_is_buddy(page, buddy, order))
> +		return buddy;
> +	return NULL;
> +}
> +

[...]

> #ifdef CONFIG_COMPACTION
> static inline struct capture_control *task_capc(struct zone *zone)
> {
> @@ -1010,18 +976,17 @@ static inline bool
>  buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
>  		   struct page *page, unsigned int order)
>  {
> -	struct page *higher_page, *higher_buddy;
> -	unsigned long combined_pfn;
> +	struct page *higher_page;
> +	unsigned long higher_page_pfn;

Nit: reverse christmas tree

-- 
Thanks,
David / dhildenb