On Thu, Apr 11, 2024 at 05:00:33PM +0800, Miaohe Lin wrote:
> But as code changes, the above page lock shift is gone. And I think below logic can't
> trigger now. As we hold extra page refcnt so page can't be coalesced into a new THP or Slab page.
>
> 	/*
> 	 * We're only intended to deal with the non-Compound page here.
> 	 * However, the page could have changed compound pages due to
> 	 * race window. If this happens, we could try again to hopefully
> 	 * handle the page next round.
> 	 */
> 	if (PageCompound(p)) {
> 		if (retry) {
> 			ClearPageHWPoison(p);
> 			unlock_page(p);
> 			put_page(p);
> 			flags &= ~MF_COUNT_INCREASED;
> 			retry = false;
> 			goto try_again;
> 		}
> 		res = action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
> 		goto unlock_page;
> 	}
>
> So it might be better to replace above code block as WARN_ON(PageCompound(p)) and remove MF_MSG_DIFFERENT_COMPOUND case.
> Any thoughts?

Yes, I think you're right. As the MM handling of pages has evolved,
people haven't kept memory-failure up to date. That's both
understandable and regrettable.

I don't have the time to focus on memory-failure myself; I have a couple
of hundred uses of page->mapping to eliminate. And I'd want to get a
lot more serious about testing before starting on that journey.

I do have ideas for handling hwpoison without splitting a folio. I'd
also really like to improve memory-failure to handle sub-page-size blast
radius (Intel CXL used to have a blast radius of 256 bytes). But
realistically, these things are never going to rise high enough on my
todo list to actually get done.