Re: [PATCH v5 08/11] mm: hwpoison: soft offline supports thp migration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Naoya Horiguchi wrote:
> On Fri, Apr 21, 2017 at 10:55:49AM -0500, Zi Yan wrote:
>>
>> Anshuman Khandual wrote:
>>> On 04/21/2017 02:17 AM, Zi Yan wrote:
>>>> From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>>>>
>>>> This patch enables thp migration for soft offline.
>>>>
>>>> Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>>>>
>>>> ChangeLog: v1 -> v5:
>>>> - fix page isolation counting error
>>>>
>>>> Signed-off-by: Zi Yan <zi.yan@xxxxxxxxxxxxxx>
>>>> ---
>>>>  mm/memory-failure.c | 35 ++++++++++++++---------------------
>>>>  1 file changed, 14 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 9b77476ef31f..23ff02eb3ed4 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -1481,7 +1481,17 @@ static struct page *new_page(struct page *p, unsigned long private, int **x)
>>>>  	if (PageHuge(p))
>>>>  		return alloc_huge_page_node(page_hstate(compound_head(p)),
>>>>  						   nid);
>>>> -	else
>>>> +	else if (thp_migration_supported() && PageTransHuge(p)) {
>>>> +		struct page *thp;
>>>> +
>>>> +		thp = alloc_pages_node(nid,
>>>> +			(GFP_TRANSHUGE | __GFP_THISNODE) & ~__GFP_RECLAIM,
>>> Why not __GFP_RECLAIM ? Its soft offline path we wait a bit before
>>> declaring that THP page cannot be allocated and hence should invoke
>>> reclaim methods as well.
>> I am not sure how much effort the kernel wants to put here to soft
>> offline a THP. Naoya knows more here.
> 
> What I thought at first was that soft offline is not an urgent user
> and no need to reclaim (i.e. give a little impact on other thread.)
> But that's not a strong opinion, so if you like __GFP_RECLAIM here,
> I'm fine about that.

OK, I will add __GFP_RECLAIM.

> 
>>
>>>> +			HPAGE_PMD_ORDER);
>>>> +		if (!thp)
>>>> +			return NULL;
>>>> +		prep_transhuge_page(thp);
>>>> +		return thp;
>>>> +	} else
>>>>  		return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
>>>>  }
>>>>  
>>>> @@ -1665,8 +1675,8 @@ static int __soft_offline_page(struct page *page, int flags)
>>>>  		 * cannot have PAGE_MAPPING_MOVABLE.
>>>>  		 */
>>>>  		if (!__PageMovable(page))
>>>> -			inc_node_page_state(page, NR_ISOLATED_ANON +
>>>> -						page_is_file_cache(page));
>>>> +			mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON +
>>>> +						page_is_file_cache(page), hpage_nr_pages(page));
>>>>  		list_add(&page->lru, &pagelist);
>>>>  		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
>>>>  					MIGRATE_SYNC, MR_MEMORY_FAILURE);
>>>> @@ -1689,28 +1699,11 @@ static int __soft_offline_page(struct page *page, int flags)
>>>>  static int soft_offline_in_use_page(struct page *page, int flags)
>>>>  {
>>>>  	int ret;
>>>> -	struct page *hpage = compound_head(page);
>>>> -
>>>> -	if (!PageHuge(page) && PageTransHuge(hpage)) {
>>>> -		lock_page(hpage);
>>>> -		if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) {
>>>> -			unlock_page(hpage);
>>>> -			if (!PageAnon(hpage))
>>>> -				pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page));
>>>> -			else
>>>> -				pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page));
>>>> -			put_hwpoison_page(hpage);
>>>> -			return -EBUSY;
>>>> -		}
>>>> -		unlock_page(hpage);
>>>> -		get_hwpoison_page(page);
>>>> -		put_hwpoison_page(hpage);
>>>> -	}
>>>>  
>>>>  	if (PageHuge(page))
>>>>  		ret = soft_offline_huge_page(page, flags);
>>>>  	else
>>>> -		ret = __soft_offline_page(page, flags);
>>>> +		ret = __soft_offline_page(compound_head(page), flags);
>>> Hmm, what if the THP allocation fails in the new_page() path and
>>> we fallback for general page allocation. In that case we will
>>> always be still calling with the head page ? Because we dont
>>> split the huge page any more.
>> This could be a problem if the user wants to offline a TailPage but due
>> to THP allocation failure, the HeadPage is offlined.
> 
> Right, "retry with split" part is unfinished, so we need some improvement.
> 
>> It may be better to only soft offline THPs if page ==
>> compound_head(page). If page != compound_head(page), we still split THPs
>> like before.
>>
>> Because in migrate_pages(), we cannot guarantee any TailPages in that
>> THP are migrated (1. THP allocation failure causes THP splitting, then
>> only HeadPage is going to be migrated; 2. even if we change existing
>> migrate_pages() implementation to add all TailPages to migration list
>> instead of LRU list, we still cannot guarantee the TailPage we want to
>> migrate is migrated.).
>>
>> Naoya, what do you think?
> 
> Maybe soft offline is a special caller of page migration because it
> basically wants to migrate only one page, but thp migration still has
> a benefit because we can avoid thp split.
> So I like that we try thp migration at first, and if it fails we fall
> back to split and migrate (only) a raw error page. This should be done
> in caller side for soft offline, because it knows where the error page is.

Make sense. So when migrate_pages() sees the migrate reason is
MR_MEMORY_FAILURE, it will not split THP when newpage allocation fails.
Then, the soft offline caller will split failed THP and retry migrating
the error subpage. I can do that.

> 
> As for generic case (for other migration callers which mainly want to
> migrate multiple pages for their purpose,) thp split and retry can be
> done in common migration code. After thp split, all subpages are linked
> to migration list, then we retry without returning to the caller.
> So I think that split_huge_page() can be moved to (for example) for-loop
> in migrate_pages().
> 
> I tried to write a patch for it last year, but considering vm event
> accounting, the patch might be large (~100 lines).

Yes. I saw your code on your github. I can pick it up and send it for
review after this patchset is merged, if you are OK with it.


-- 
Best Regards,
Yan Zi

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]
  Powered by Linux