On 10/10/2018 06:13 PM, Zi Yan wrote: > On 10 Oct 2018, at 0:05, Anshuman Khandual wrote: > >> On 10/09/2018 07:28 PM, Zi Yan wrote: >>> cc: Naoya Horiguchi (who proposed to use !_PAGE_PRESENT && !_PAGE_PSE for x86 >>> PMD migration entry check) >>> >>> On 8 Oct 2018, at 23:58, Anshuman Khandual wrote: >>> >>>> A normal mapped THP page at PMD level should be correctly differentiated >>>> from a PMD migration entry while walking the page table. A mapped THP would >>>> additionally check positive for pmd_present() along with pmd_trans_huge() >>>> as compared to a PMD migration entry. This just adds a new conditional test >>>> differentiating the two while walking the page table. >>>> >>>> Fixes: 616b8371539a6 ("mm: thp: enable thp migration in generic path") >>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx> >>>> --- >>>> On X86, pmd_trans_huge() and is_pmd_migration_entry() are always mutually >>>> exclusive which makes the current conditional block work for both mapped >>>> and migration entries. This is not same with arm64 where pmd_trans_huge() >>> >>> !pmd_present() && pmd_trans_huge() is used to represent THPs under splitting, >> >> Not really if we just look at code in the conditional blocks. > > Yeah, I explained it wrong above. Sorry about that. > > In x86, pmd_present() checks (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE), > thus, it returns true even if the present bit is cleared but PSE bit is set. Okay. > This is done so, because THPs under splitting are regarded as present in the kernel > but not present when a hardware page table walker checks it. Okay. > > For PMD migration entry, which should be regarded as not present, if PSE bit > is set, which makes pmd_trans_huge() returns true, like ARM64 does, all > PMD migration entries will be regarded as present Okay to make pmd_present() return false pmd_trans_huge() has to return false as well. Is there anything which can be done to get around this problem on X86 ? pmd_trans_huge() returning true for a migration entry sounds logical. Otherwise we would revert the condition block order to accommodate both the implementation for pmd_trans_huge() as suggested by Kirill before or just consider this patch forward. Because I am not really sure yet about the idea of getting pmd_present() check into pmd_trans_huge() on arm64 just to make it fit into this semantics as suggested by Will. If a PMD is trans huge page or not should not depend on whether it is present or not. > > My concern is that if ARM64’s pmd_trans_huge() returns true for migration > entries, unlike x86, there might be bugs triggered in the kernel when > THP migration is enabled in ARM64. Right and that is exactly what we are trying to fix with this patch. > > Let me know if I explain this clear to you. > >> >>> since _PAGE_PRESENT is cleared during THP splitting but _PAGE_PSE is not. >>> See the comment in pmd_present() for x86, in arch/x86/include/asm/pgtable.h >> >> >> if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) { >> pvmw->ptl = pmd_lock(mm, pvmw->pmd); >> if (likely(pmd_trans_huge(*pvmw->pmd))) { >> if (pvmw->flags & PVMW_MIGRATION) >> return not_found(pvmw); >> if (pmd_page(*pvmw->pmd) != page) >> return not_found(pvmw); >> return true; >> } else if (!pmd_present(*pvmw->pmd)) { >> if (thp_migration_supported()) { >> if (!(pvmw->flags & PVMW_MIGRATION)) >> return not_found(pvmw); >> if (is_migration_entry(pmd_to_swp_entry(*pvmw->pmd))) { >> swp_entry_t entry = pmd_to_swp_entry(*pvmw->pmd); >> >> if (migration_entry_to_page(entry) != page) >> return not_found(pvmw); >> return true; >> } >> } >> return not_found(pvmw); >> } else { >> /* THP pmd was split under us: handle on pte level */ >> spin_unlock(pvmw->ptl); >> pvmw->ptl = NULL; >> } >> } else if (!pmd_present(pmde)) { ---> Outer 'else if' >> return false; >> } >> >> Looking at the above code, it seems the conditional check for a THP >> splitting case would be (!pmd_trans_huge && pmd_present) instead as >> it has skipped the first two conditions. But THP splitting must have >> been initiated once it has cleared the outer check (else it would not >> have cleared otherwise) >> >> if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)). > > If a THP is under splitting, both pmd_present() and pmd_trans_huge() return > true in x86. The else part (/* THP pmd was split under us … */) happens > after splitting is done. Okay, got it. > >> BTW what PMD state does the outer 'else if' block identify which must >> have cleared the following condition to get there. >> >> (!pmd_present && !pmd_trans_huge && !is_pmd_migration_entry) > > I think it is the case that the PMD is gone or equivalently pmd_none(). > This PMD entry is not in use. Okay, got it.