On Thu, 26 Jun 2014 15:50:36 -0400 Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> wrote:

> > index 90002ea..e277726a 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1143,6 +1143,22 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
> >  	lock_page(hpage);
> >
> >  	/*
> > +	 * The page could have turned into a non LRU page or
> > +	 * changed compound pages during the locking.
> > +	 * If this happens just bail out.
> > +	 */
> > +	if (compound_head(p) != hpage) {
> > +		action_result(pfn, "different compound page after locking", IGNORED);
> > +		res = -EBUSY;
> > +		goto out;
> > +	}
>
> This is a useful check.
>
> > +	if (!PageLRU(hpage)) {
> > +		action_result(pfn, "non LRU after locking", IGNORED);
> > +		res = -EBUSY;
> > +		goto out;
> > +	}
>
> I think this makes sense in v3.14, but maybe redundant if the patch "hwpoison:
> fix the handling path of the victimized page frame that belong to non-LRU"
> from Chen Yucong is merged into mainline (now it's in linux-mmotm).

Andi, can you please check that and test?  If the patch is good I'll
bump it into 3.16 with an enhanced changelog.


From: Chen Yucong <slaoub@xxxxxxxxx>
Subject: hwpoison: fix the handling path of the victimized page frame that belong to non-LRU

Until now, the kernel has applied the same policy to victimized page
frames that belong to kernel space (reserved/slab subsystem) and to
non-LRU pages (unknown page state).  In other words, the result of
handling either kind of victimized page frame is (IGNORED | FAILED), and
the return value of memory_failure() is -EBUSY.

This patch keeps memory_failure() from returning early merely because
(!PageLRU(p)) is true, and it ensures that action_result() can report
more precise information ("reserved kernel", "kernel slab", and "unknown
page state") instead of "non LRU", especially for memory errors that are
detected by memory scrubbing.

Signed-off-by: Chen Yucong <slaoub@xxxxxxxxx>
Acked-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memory-failure.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff -puN mm/memory-failure.c~hwpoison-fix-the-handling-path-of-the-victimized-page-frame-that-belong-to-non-lur mm/memory-failure.c
--- a/mm/memory-failure.c~hwpoison-fix-the-handling-path-of-the-victimized-page-frame-that-belong-to-non-lur
+++ a/mm/memory-failure.c
@@ -895,7 +895,7 @@ static int hwpoison_user_mappings(struct
 	struct page *hpage = *hpagep;
 	struct page *ppage;
 
-	if (PageReserved(p) || PageSlab(p))
+	if (PageReserved(p) || PageSlab(p) || !PageLRU(p))
 		return SWAP_SUCCESS;
 
 	/*
@@ -1159,9 +1159,6 @@ int memory_failure(unsigned long pfn, in
 				action_result(pfn, "free buddy, 2nd try", DELAYED);
 				return 0;
 			}
-			action_result(pfn, "non LRU", IGNORED);
-			put_page(p);
-			return -EBUSY;
 		}
 	}
 
@@ -1194,6 +1191,9 @@ int memory_failure(unsigned long pfn, in
 		return 0;
 	}
 
+	if (!PageHuge(p) && !PageTransTail(p) && !PageLRU(p))
+		goto identify_page_state;
+
 	/*
 	 * For error on the tail page, we should set PG_hwpoison
 	 * on the head page to show that the hugepage is hwpoisoned
@@ -1243,6 +1243,7 @@ int memory_failure(unsigned long pfn, in
 		goto out;
 	}
 
+identify_page_state:
 	res = -EBUSY;
 	/*
 	 * The first check uses the current page flags which may not have any
_
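
For readers who want to see the reporting change in isolation, here is a
minimal userspace sketch of the decision the patch alters.  It is not
kernel code and rests on several assumptions: struct mock_page,
report_before() and report_after() are hypothetical names invented for
this illustration, only the three page states named in the changelog are
modelled (the kernel's page-state lookup checks more than these), the
shake_page() and free-buddy retry steps are left out, and the string
"continues on normal path" is a placeholder rather than a real
action_result() message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the few page flags the patch tests. */
struct mock_page {
	bool reserved;   /* models PageReserved() */
	bool slab;       /* models PageSlab() */
	bool lru;        /* models PageLRU() */
	bool huge;       /* models PageHuge() */
	bool trans_tail; /* models PageTransTail() */
};

/* Same condition the patch adds before "goto identify_page_state". */
bool is_plain_non_lru(const struct mock_page *p)
{
	return !p->huge && !p->trans_tail && !p->lru;
}

/* Before the patch: every such page got the same generic message. */
const char *report_before(const struct mock_page *p)
{
	if (is_plain_non_lru(p))
		return "non LRU";
	return "continues on normal path"; /* placeholder, not a kernel string */
}

/*
 * After the patch: the page falls through to page-state identification,
 * so the message depends on what kind of page it is.  Only the three
 * states named in the changelog are modelled here.
 */
const char *report_after(const struct mock_page *p)
{
	if (is_plain_non_lru(p)) {
		if (p->reserved)
			return "reserved kernel";
		if (p->slab)
			return "kernel slab";
		return "unknown page state";
	}
	return "continues on normal path"; /* placeholder, not a kernel string */
}
```

Calling report_before() and report_after() on a mock_page with only
.slab set shows the difference: the first returns "non LRU", the second
"kernel slab".  In both cases the modelled memory_failure() result would
still be -EBUSY; only the diagnostic becomes more specific.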