On Tue, 12 May 2015 09:46:47 +0000 Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> wrote: > memory_failrue() can run in 2 different mode (specified by MF_COUNT_INCREASED) > in page refcount perspective. When MF_COUNT_INCREASED is set, memory_failrue() > assumes that the caller takes a refcount of the target page. And if cleared, > memory_failure() takes it in it's own. > > In current code, however, refcounting is done differently in each caller. For > example, madvise_hwpoison() uses get_user_pages_fast() and hwpoison_inject() > uses get_page_unless_zero(). So this inconsistent refcounting causes refcount > failure especially for thp tail pages. Typical user visible effects are like > memory leak or VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page(). > > To fix this refcounting issue, this patch introduces get_hwpoison_page() to > handle thp tail pages in the same manner for each caller of hwpoison code. > > There's a non-trivial change around unpoisoning, which now returns immediately > for thp with "MCE: Memory failure is now running on %#lx\n" message. This is > not right when split_huge_page() fails. So this patch also allows > unpoison_memory() to handle thp. > > ... > > /* > + * Get refcount for memory error handling: > + * - @page: raw page > + */ > +inline int get_hwpoison_page(struct page *page) > +{ > + struct page *head = compound_head(page); > + > + if (PageHuge(head)) > + return get_page_unless_zero(head); > + else if (PageTransHuge(head)) > + if (get_page_unless_zero(head)) { > + if (PageTail(page)) > + get_page(page); > + return 1; > + } else { > + return 0; > + } > + else > + return get_page_unless_zero(page); > +} This function is a bit weird. - The comment looks like kerneldoc but isn't kerneldoc - Why the inline? It isn't fastpath? - The layout is rather painful. It could be if (PageHuge(head)) return get_page_unless_zero(head); if (PageTransHuge(head)) { if (get_page_unless_zero(head)) { if (PageTail(page)) get_page(page); return 1; } else { return 0; } } return get_page_unless_zero(page); - Some illuminating comments would be nice. In particular that code path where it grabs a ref on the tail page as well as on the head page. What's going on there? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>