On Mon, May 04, 2015 at 10:59:16PM +0530, Aneesh Kumar K.V wrote:
> Archs like ppc64 require pte_t * to remain stable in some code path.
> They use local_irq_disable to prevent a parallel split. Generic code
> clear pmd instead of marking it _PAGE_SPLITTING in code path
> where we can afford to mark pmd none before splitting. Use a
> variant of pmdp_splitting_clear_notify that arch can override.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>

Sorry, I'm still trying to wrap my head around this problem.

So, Power has __find_linux_pte_or_hugepte(), which does a lock-less
lookup in the page tables with local interrupts disabled. For huge pages
it casts pmd_t to pte_t. Since the format of pte_t is different from
pmd_t, we want to prevent a transition from a pmd pointing to a page
table to a pmd pointing to a huge page (and back) while interrupts are
disabled.

The complication for Power is that it doesn't do an implicit IPI on TLB
flush.

Is that correct?

For THP, the split_huge_page() and collapse sides are covered. This
patch should address the two cases in current upstream that split the
PMD but not the compound page.

But I think there's still a *big* problem for Power: zap_huge_pmd().

For instance, another CPU can shoot down a THP PMD with MADV_DONTNEED
and fault in small pages instead. IIUC, for
__find_linux_pte_or_hugepte(), that's equivalent to splitting.

I don't see how this can be fixed without kick_all_cpus_sync() in every
pmdp_clear_flush() on Power.

--
 Kirill A. Shutemov