On Sun, Apr 10, 2022 at 11:22:34PM +0800, Xu Yu wrote: > Kernel panic when injecting memory_failure for the global huge_zero_page, > when CONFIG_DEBUG_VM is enabled, as follows. ... > In fact, huge_zero_page is unhandlable currently in either soft offline > or memory failure injection. With CONFIG_DEBUG_VM disabled, > huge_zero_page is bailed out when checking HWPoisonHandlable() in > get_any_page(), or checking page mapping in split_huge_page_to_list(). > > This makes huge_zero_page bail out early in madvise_inject_error(), and > panic above won't happen again. I would not special case this in madvise_inject_error() but rather handle it in memory-failure code. We do already have HWPoisonHandlable(), which tells us whether the page is of a type we can really do something about, so why not add another check in HWPoisonHandlable() for huge_zero_page(), and have that checked in memory_failure(). Something like (untested): diff --git a/mm/memory-failure.c b/mm/memory-failure.c index dcb6bb9cf731..dccd0503f803 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1181,6 +1181,10 @@ static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) { bool movable = false; + /* Can't handle huge_zero_page() */ + if(is_huge_zero_page(compound_head(page))) + return false; + /* Soft offline could mirgate non-LRU movable pages */ if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) movable = true; @@ -1796,6 +1800,10 @@ int memory_failure(unsigned long pfn, int flags) res = -EBUSY; goto unlock_mutex; } + } else if(!HWPoisonHandable(p)) { + action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED); + res = -EBUSY; + goto unlock_mutex; } if (PageTransHuge(hpage)) { It can certainly be prettier, but you can get the idea. > > Reported-by: Abaci <abaci@xxxxxxxxxxxxxxxxx> > Signed-off-by: Xu Yu <xuyu@xxxxxxxxxxxxxxxxx> > --- > mm/madvise.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 1873616a37d2..03ad50d222e0 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -1079,7 +1079,7 @@ static int madvise_inject_error(int behavior, > > for (; start < end; start += size) { > unsigned long pfn; > - struct page *page; > + struct page *page, *head; > int ret; > > ret = get_user_pages_fast(start, 1, 0, &page); > @@ -1087,12 +1087,21 @@ static int madvise_inject_error(int behavior, > return ret; > pfn = page_to_pfn(page); > > + head = compound_head(page); > + if (unlikely(is_huge_zero_page(head))) { > + pr_warn("Unhandlable attempt to %s pfn %#lx at process virtual address %#lx\n", > + behavior == MADV_SOFT_OFFLINE ? "soft offline" : > + "inject memory failure for", > + pfn, start); > + return -EINVAL; > + } > + > /* > * When soft offlining hugepages, after migrating the page > * we dissolve it, therefore in the second loop "page" will > * no longer be a compound page. > */ > - size = page_size(compound_head(page)); > + size = page_size(head); > > if (behavior == MADV_SOFT_OFFLINE) { > pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n", > -- > 2.20.1.2432.ga663e714 > > -- Oscar Salvador SUSE Labs