On Fri, Sep 18, 2020 at 11:35:02AM +0200, Oscar Salvador wrote: > On Tue, Sep 08, 2020 at 09:05:56AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote: > > > > thp just after passing over the following block: > > > > > > > > > > > > > > > > > 1408 if (PageTransHuge(hpage)) { > > > > 1409 if (try_to_split_thp_page(p, "Memory Failure") < 0) { > > > > 1410 action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); > > > > 1411 return -EBUSY; > > > > 1412 } > > > > 1413 VM_BUG_ON_PAGE(!page_count(p), p); > > > > 1414 } > > > > > > > > So I feel that some check might be added after holding page lock to avoid > > > > that case. Or acutally, it might better that moving the above block into > > > > page lock is more better for simpler code. > > > > > > I will have a look at this. > > > > Thank you! > > Hi Naoya, > > I have been taking a look at this, and unless I am missing something obvious I > do not think that a new THP (containing the page) can be collapsed under us: > > We do take a refcount on the page by means of get_hwpoison_page. > We could only have done that if the page was mapped, so its refcount was already > above 0. > > Then we split the THP, and the refcount/mapcount go to the page we are trying to > poison. > At this point the page should add least have refcount > 1 mapcount >= 1. > > After that, let us assume that a new THP is trying to be collapsed by means of > khugepaged thread or madvise MADV_HUGEPAGE. > > khugepaged_scan_pmd() scans all ptes from [pte#0..pte#511] to see if they can > be collapsed, and one of the things it does is checking the page's refcount/ > mappcount by calling is_refcount_suitable(). > > expected_refcount = total_mapcount(page); > return page_count(page) == expected_refcount; > > We do have an extra pin from memory_failure, so this is going to fail because > > page: refcount = 2 , mapcount = 1 > > Beware that the page must sitll be mapped somehow, otherwise the PageLRU check > from above should have failed with the same result: > > if (!PageLRU(page)) { > result = SCAN_PAGE_LRU; > goto out_unmap; > } > > So, I do not think the page can be collapsed into a new THP after we have split > it here, but as I said, I might be missing something. This logic sounds convincing to me, or another possibility like conversion to other types of compound_page (like slab) is also prevented due to the refcount. The MF_MSG_DIFFERENT_COMPOUND path was originally introduced heuristically based on error report in stress testing, and the mechanism of the problem was unclear. Thanks, Naoya Horiguchi