On Thu, Oct 17, 2019 at 04:21:17PM +0200, Oscar Salvador wrote:
> When trying to soft-offline a free page, we need to first take it off
> the buddy allocator.
> Once we know is out of reach, we can safely flag it as poisoned.
>
> take_page_off_buddy will be used to take a page meant to be poisoned
> off the buddy allocator.
> take_page_off_buddy calls break_down_buddy_pages, which splits a
> higher-order page in case our page belongs to one.
>
> Once the page is under our control, we call page_set_poison to set it

I guess you mean page_handle_poison here.

> as poisoned and grab a refcount on it.
>
> Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
> ---
>  mm/memory-failure.c | 20 +++++++++++-----
>  mm/page_alloc.c     | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 82 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 37b230b8cfe7..1d986580522d 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -78,6 +78,15 @@ EXPORT_SYMBOL_GPL(hwpoison_filter_dev_minor);
>  EXPORT_SYMBOL_GPL(hwpoison_filter_flags_mask);
>  EXPORT_SYMBOL_GPL(hwpoison_filter_flags_value);
>
> +extern bool take_page_off_buddy(struct page *page);
> +
> +static void page_handle_poison(struct page *page)

hwpoison is a separate idea from page poisoning, so it might be better
to use a name like page_handle_hwpoison().
> +{
> +	SetPageHWPoison(page);
> +	page_ref_inc(page);
> +	num_poisoned_pages_inc();
> +}
> +
>  static int hwpoison_filter_dev(struct page *p)
>  {
>  	struct address_space *mapping;
> @@ -1830,14 +1839,13 @@ static int soft_offline_in_use_page(struct page *page)
>
>  static int soft_offline_free_page(struct page *page)
>  {
> -	int rc = dissolve_free_huge_page(page);
> +	int rc = -EBUSY;
>
> -	if (!rc) {
> -		if (set_hwpoison_free_buddy_page(page))
> -			num_poisoned_pages_inc();
> -		else
> -			rc = -EBUSY;
> +	if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
> +		page_handle_poison(page);
> +		rc = 0;
>  	}
> +
>  	return rc;
>  }
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cd1dd0712624..255df0c76a40 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8632,6 +8632,74 @@ bool is_free_buddy_page(struct page *page)
>
>  #ifdef CONFIG_MEMORY_FAILURE
>  /*
> + * Break down a higher-order page in sub-pages, and keep our target out of
> + * buddy allocator.
> + */
> +static void break_down_buddy_pages(struct zone *zone, struct page *page,
> +				   struct page *target, int low, int high,
> +				   struct free_area *area, int migratetype)
> +{
> +	unsigned long size = 1 << high;
> +	struct page *current_buddy, *next_page;
> +
> +	while (high > low) {
> +		area--;
> +		high--;
> +		size >>= 1;
> +
> +		if (target >= &page[size]) {
> +			next_page = page + size;
> +			current_buddy = page;
> +		} else {
> +			next_page = page;
> +			current_buddy = page + size;
> +		}
> +
> +		if (set_page_guard(zone, current_buddy, high, migratetype))
> +			continue;
> +
> +		if (current_buddy != target) {
> +			add_to_free_area(current_buddy, area, migratetype);
> +			set_page_order(current_buddy, high);
> +			page = next_page;
> +		}
> +	}
> +}
> +
> +/*
> + * Take a page that will be marked as poisoned off the buddy allocator.
> + */
> +bool take_page_off_buddy(struct page *page)
> + {
> +	struct zone *zone = page_zone(page);
> +	unsigned long pfn = page_to_pfn(page);
> +	unsigned long flags;
> +	unsigned int order;
> +	bool ret = false;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +	for (order = 0; order < MAX_ORDER; order++) {
> +		struct page *page_head = page - (pfn & ((1 << order) - 1));
> +		int buddy_order = page_order(page_head);
> +		struct free_area *area = &(zone->free_area[buddy_order]);
> +
> +		if (PageBuddy(page_head) && buddy_order >= order) {
> +			unsigned long pfn_head = page_to_pfn(page_head);
> +			int migratetype = get_pfnblock_migratetype(page_head,
> +								   pfn_head);
> +
> +			del_page_from_free_area(page_head, area);
> +			break_down_buddy_pages(zone, page_head, page, 0,
> +					       buddy_order, area, migratetype);
> +			ret = true;
> +			break;

Indented with whitespace? You can find a few more coding style warnings
with checkpatch.pl.

BTW, if we consider making the unpoison mechanism keep up with the new
semantics, we will need the reverse operation of take_page_off_buddy().
Do you think that part will come as separate work?

Thanks,
Naoya Horiguchi

> +		}
> +	}
> +	spin_unlock_irqrestore(&zone->lock, flags);
> +	return ret;
> + }
> +
> +/*
>  * Set PG_hwpoison flag if a given page is confirmed to be a free page. This
>  * test is performed under the zone lock to prevent a race against page
>  * allocation.
> --
> 2.12.3