On Tue, Feb 21, 2023 at 05:59:05PM +0900, Naoya Horiguchi wrote:
> After a memory error happens on a clean folio, a process unexpectedly
> receives SIGBUS when it accesses to the error page. This SIGBUS killing
> is pointless and simply degrades the level of RAS of the system, because
> the clean folio can be dropped without any data lost on memory error
> handling as we do for a clean pagecache.
>
> When memory_failure() is called on a clean folio, try_to_unmap() is called
> twice (one from split_huge_page() and one from hwpoison_user_mappings()).
> The root cause of the issue is that pte conversion to hwpoisoned entry is
> now done in the first call of try_to_unmap() because PageHWPoison is already
> set at this point, while it's actually expected to be done in the second
> call. This behavior disturbs the error handling operation like removing
> pagecache, which results in the malfunction described above.
>
> So convert TTU_IGNORE_HWPOISON into TTU_HWPOISON and set TTU_HWPOISON only
> when we really intend to convert pte to hwpoison entry. This can prevent
> other callers of try_to_unmap() from accidentally converting to hwpoison
> entries.
>
> Fixes: a42634a6c07d ("readahead: Use a folio in read_pages()")

How did you choose this Fixes tag?
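
For anyone following along, the flag inversion described in the quoted text can be modelled in a few lines of userspace C. This is only a sketch of the opt-out versus opt-in semantics as I read them from the description, not the kernel code: the `MODEL_*` constants, their values, and both helper functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TTU flags; values are illustrative only. */
#define MODEL_TTU_IGNORE_HWPOISON 0x1u /* old scheme: caller opts OUT */
#define MODEL_TTU_HWPOISON        0x2u /* new scheme: caller opts IN  */

/* Old scheme: a poisoned page gets a hwpoison entry unless the caller
 * explicitly asked to ignore hwpoison. */
static bool old_converts_to_hwpoison(bool page_hwpoison, unsigned int flags)
{
	return page_hwpoison && !(flags & MODEL_TTU_IGNORE_HWPOISON);
}

/* New scheme: a poisoned page gets a hwpoison entry only when the caller
 * explicitly requested it. */
static bool new_converts_to_hwpoison(bool page_hwpoison, unsigned int flags)
{
	return page_hwpoison && (flags & MODEL_TTU_HWPOISON);
}

/* Walk through the two unmap calls from the description. */
static void check_model(void)
{
	/* First call (split path): no hwpoison-related flag is passed,
	 * but the page is already marked poisoned. */
	assert(old_converts_to_hwpoison(true, 0));   /* converts too early */
	assert(!new_converts_to_hwpoison(true, 0));  /* left alone */

	/* Second call (error handling path): conversion is intended. */
	assert(new_converts_to_hwpoison(true, MODEL_TTU_HWPOISON));

	/* A page that is not poisoned is never converted in either scheme. */
	assert(!old_converts_to_hwpoison(false, 0));
	assert(!new_converts_to_hwpoison(false, MODEL_TTU_HWPOISON));
}
```

With an opt-out flag, every caller that forgets the flag converts the pte as a side effect once PageHWPoison is set. With an opt-in flag, only the caller that means to convert does so, which is the property the description is after.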