The patch titled Subject: mm: invalidate hwpoison page cache page in fault path has been added to the -mm tree. Its filename is mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch This patch should soon appear at https://ozlabs.org/~akpm/mmots/broken-out/mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch and later at https://ozlabs.org/~akpm/mmotm/broken-out/mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Rik van Riel <riel@xxxxxxxxxxx> Subject: mm: invalidate hwpoison page cache page in fault path Sometimes the page offlining code can leave behind a hwpoisoned clean page cache page. This can lead to programs being killed over and over and over again as they fault in the hwpoisoned page, get killed, and then get re-spawned by whatever wanted to run them. This is particularly embarrassing when the page was offlined due to having too many corrected memory errors. Now we are killing tasks due to them trying to access memory that probably isn't even corrupted. This problem can be avoided by invalidating the page from the page fault handler, which already has a branch for dealing with these kinds of pages. With this patch we simply pretend the page fault was successful if the page was invalidated, return to userspace, incur another page fault, read in the file from disk (to a new memory page), and then everything works again. Link: https://lkml.kernel.org/r/20220212213740.423efcea@xxxxxxxxxxxxxxxxxxxx Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx> Reviewed-by: Miaohe Lin <linmiaohe@xxxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/memory.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/mm/memory.c~mm-clean-up-hwpoison-page-cache-page-in-fault-path +++ a/mm/memory.c @@ -3926,11 +3926,16 @@ static vm_fault_t __do_fault(struct vm_f return ret; if (unlikely(PageHWPoison(vmf->page))) { - if (ret & VM_FAULT_LOCKED) + vm_fault_t poisonret = VM_FAULT_HWPOISON; + if (ret & VM_FAULT_LOCKED) { + /* Retry if a clean page was removed from the cache. */ + if (invalidate_inode_page(vmf->page)) + poisonret = 0; unlock_page(vmf->page); + } put_page(vmf->page); vmf->page = NULL; - return VM_FAULT_HWPOISON; + return poisonret; } if (unlikely(!(ret & VM_FAULT_LOCKED))) _ Patches currently in -mm which might be from riel@xxxxxxxxxxx are mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch