>On Fri, Sep 18, 2020 at 11:35:02AM +0200, Oscar Salvador wrote: > On Tue, Sep 08, 2020 at 09:05:56AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote: > > > > thp just after passing over the following block: > > > > > > > > > > > > > > > > > 1408 if (PageTransHuge(hpage)) { > > > > 1409 if (try_to_split_thp_page(p, "Memory Failure") < 0) { > > > > 1410 action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); > > > > 1411 return -EBUSY; > > > > 1412 } > > > > 1413 VM_BUG_ON_PAGE(!page_count(p), p); > > > > 1414 } > > > > > > > > So I feel that some check might be added after holding page lock > > > > to avoid that case. Or acutally, it might better that moving the > > > > above block into page lock is more better for simpler code. > > > > > > I will have a look at this. > > > > Thank you! > > Hi Naoya, > > I have been taking a look at this, and unless I am missing something > obvious I do not think that a new THP (containing the page) can be collapsed under us: > > We do take a refcount on the page by means of get_hwpoison_page. > We could only have done that if the page was mapped, so its refcount > was already above 0. > > Then we split the THP, and the refcount/mapcount go to the page we are > trying to poison. > At this point the page should add least have refcount > 1 mapcount >= 1. > > After that, let us assume that a new THP is trying to be collapsed by > means of khugepaged thread or madvise MADV_HUGEPAGE. > > khugepaged_scan_pmd() scans all ptes from [pte#0..pte#511] to see if > they can be collapsed, and one of the things it does is checking the > page's refcount/ mappcount by calling is_refcount_suitable(). > > expected_refcount = total_mapcount(page); > return page_count(page) == expected_refcount; > > We do have an extra pin from memory_failure, so this is going to fail > because > > page: refcount = 2 , mapcount = 1 > > Beware that the page must sitll be mapped somehow, otherwise the > PageLRU check from above should have failed with the same result: > > if (!PageLRU(page)) { > result = SCAN_PAGE_LRU; > goto out_unmap; > } > > So, I do not think the page can be collapsed into a new THP after we > have split it here, but as I said, I might be missing something. > >This logic sounds convincing to me, or another possibility like conversion >to other types of compound_page (like slab) is also prevented due to the refcount. > >The MF_MSG_DIFFERENT_COMPOUND path was originally introduced heuristically >based on error report in stress testing, and the mechanism of the problem was unclear. > >Thanks, >Naoya Horiguchi There is another check for hwpoision status in this function: /* * unpoison always clear PG_hwpoison inside page lock */ if (!PageHWPoison(p)) { pr_err("Memory failure: %#lx: just unpoisoned\n", pfn); num_poisoned_pages_dec(); unlock_page(p); put_hwpoison_page(p); return 0; } I think the check here and the check for MF_MSG_DIFFERENT_COMPOUND are both for stress test purpose, In stress test scenario, the page may be unpoisioned and be reallocted meanwhile, so the code here and previously talked check really does some meaningful ckecking for test, which will not happen in real cases,As the poision page will not be unposioned for normal memory. Thanks Aili Yao