> Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault path

At first scan I thought this was a code cleanup.

I think I'll do s/clean up/invalidate/.

On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@xxxxxxxxxxx> wrote:

> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page.

Is this correct behaviour?

> This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.

Is this worth a cc:stable?
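
For anyone following along, the flow described in the changelog (invalidate the clean poisoned page, pretend the fault succeeded, fault again, read from disk) can be modelled in a few lines of userspace C. This is only a toy sketch under my own assumptions, not the patch itself: every name below (toy_page, toy_cache, toy_fault and friends) is hypothetical and none of them are kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of the changelog's flow; not kernel code. */
enum fault_result { FAULT_OK, FAULT_RETRY, FAULT_HWPOISON };

struct toy_page {
	bool hwpoison;	/* page was marked bad by the offlining code */
	bool dirty;	/* dirty pages cannot simply be dropped */
};

struct toy_cache {
	struct toy_page *slot;	/* one-entry "page cache" for one file offset */
};

/* Drop a clean page from the cache; returns true if it was removed. */
static bool toy_invalidate(struct toy_cache *cache)
{
	if (cache->slot == NULL || cache->slot->dirty)
		return false;
	cache->slot = NULL;
	return true;
}

/* Stand-in for reading the file from disk into a new memory page. */
static void toy_read_from_disk(struct toy_cache *cache, struct toy_page *fresh)
{
	fresh->hwpoison = false;
	fresh->dirty = false;
	cache->slot = fresh;
}

/* One pass through the (hypothetical) fault handler. */
static enum fault_result toy_fault(struct toy_cache *cache,
				   struct toy_page *fresh)
{
	if (cache->slot == NULL) {
		/* Cache miss: populate from disk, fault succeeds. */
		toy_read_from_disk(cache, fresh);
		return FAULT_OK;
	}

	if (cache->slot->hwpoison) {
		/* Clean page removed: pretend success, caller faults again. */
		if (toy_invalidate(cache))
			return FAULT_RETRY;
		/* Could not invalidate: the task is killed, as before. */
		return FAULT_HWPOISON;
	}

	return FAULT_OK;
}
```

In this model the first fault on a clean poisoned page returns FAULT_RETRY and empties the slot, and the second fault reads a fresh page and succeeds. A dirty poisoned page still yields FAULT_HWPOISON, since its contents cannot be re-read from disk.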