On Mon, Sep 08, 2014 at 12:13:16AM -0700, Hugh Dickins wrote:
> > > One subtlety to take care over: it's a long time since I've had to
> > > worry about pmd folding and pud folding (what happens when you only
> > > have 2 or 3 levels of page table instead of the full 4): macros get
> > > defined to each other, and levels get optimized out (perhaps
> > > differently on different architectures).
> > >
> > > So although at first sight the lock to take in follow_huge_pud()
> > > would seem to be mm->page_table_lock, I am not at this point certain
> > > that that's necessarily so - sometimes pud_huge might be pmd_huge,
> > > and the size PMD_SIZE, and pmd_lockptr appropriate at what appears
> > > to be the pud level. Maybe: needs checking through the architectures
> > > and their configs, not obvious to me.
> >
> > I think that every architecture uses mm->page_table_lock for pud-level
> > locking at least for now, but that could be changed in the future,
> > for example when 1GB hugepages or pud-based hugepages become common and
> > someone are interested in splitting lock for pud level.
>
> I'm not convinced by your answer, that you understand the (perhaps
> imaginary!) issue I'm referring to. Try grep for __PAGETABLE_P.D_FOLDED.
>
> Our infrastructure allows for 4 levels of pagetable, pgd pud pmd pte,
> but many architectures/configurations support only 2 or 3 levels.
> What pud functions and pmd functions work out to be in those
> configs is confusing, and varies from architecture to architecture.
>
> In particular, pud and pmd may be different expressions of the same
> thing (with 1 pmd per pud, instead of say 512). In that case PUD_SIZE
> will equal PMD_SIZE: and then at the pud level huge_pte_lockptr()
> will be using split locking instead of mm->page_table_lock.
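
To make the concern quoted above concrete, here is a rough user-space
sketch, not kernel code. It assumes, as the quoted text describes, that
huge_pte_lockptr() picks the lock by comparing the huge page size with
PMD_SIZE. All names in it (pt_config, model_huge_pte_lock(),
pud_lock_mismatch(), the split_pmd_ptlock flag) are made up for
illustration; the flag stands in for whether split PMD locks are enabled
in a given config.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: which lock a code path would end up taking. */
enum lock_kind {
	LOCK_PAGE_TABLE,	/* stands in for mm->page_table_lock */
	LOCK_SPLIT_PMD		/* stands in for a per-PMD-page split lock */
};

/* Hypothetical description of one architecture/config combination. */
struct pt_config {
	unsigned long pmd_size;	/* models PMD_SIZE */
	unsigned long pud_size;	/* models PUD_SIZE */
	bool split_pmd_ptlock;	/* models split PMD locks being enabled */
};

/* Models pmd_lockptr(): split lock if enabled, else the mm-wide lock. */
static enum lock_kind model_pmd_lock(const struct pt_config *c)
{
	return c->split_pmd_ptlock ? LOCK_SPLIT_PMD : LOCK_PAGE_TABLE;
}

/* Models the size-based choice attributed to huge_pte_lockptr(). */
static enum lock_kind model_huge_pte_lock(const struct pt_config *c,
					  unsigned long huge_size)
{
	assert(huge_size != 0);
	if (huge_size == c->pmd_size)
		return model_pmd_lock(c);
	return LOCK_PAGE_TABLE;
}

/*
 * A follow_huge_pud() that always takes mm->page_table_lock disagrees
 * with the size-based choice exactly when this returns true.
 */
static bool pud_lock_mismatch(const struct pt_config *c)
{
	return model_huge_pte_lock(c, c->pud_size) != LOCK_PAGE_TABLE;
}
```

In this model the mismatch appears only when PUD_SIZE == PMD_SIZE and
split PMD locks are enabled at the same time; whether any real config
can end up in that state is the open question.
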
<sorry for delay -- just back from vacation>

Looks like we can't have PMD folded unless PUD is folded too:

include/asm-generic/pgtable-nopmd.h:#include <asm-generic/pgtable-nopud.h>

It means we have three cases:

- Both PMD and PUD are not folded. PUD_SIZE == PMD_SIZE can be true only
  if the PUD table consists of one entry, which is, hmm, strange.
- PUD is folded, PMD is not. In this case PUD_SIZE is equal to
  PGDIR_SIZE, which is always (I believe) greater than PMD_SIZE.
- Both are folded: PMD_SIZE == PUD_SIZE == PGDIR_SIZE, but we solve it
  with ARCH_ENABLE_SPLIT_PMD_PTLOCK. It is only enabled on
  configurations where PMD is not folded. Without
  ARCH_ENABLE_SPLIT_PMD_PTLOCK, pmd_lockptr() points to
  mm->page_table_lock.

Does it make sense?

> Many of the hugetlb architectures have a pud_huge() which just returns
> 0, and we need not worry about those, nor the follow_huge_addr() powerpc.
> But arm64, mips, tile, x86 look more interesting.
>
> Frankly, I find myself too dumb to be sure of the right answer for all:
> and think that when we put the proper locking into follow_huge_pud(),
> we shall have to include a PUD_SIZE == PMD_SIZE test, to let the
> compiler decide for us which is the appropriate locking to match
> huge_pte_lockptr().
>
> Unless Kirill can illuminate: I may be afraid of complications
> where actually there are none.

I'm more worried about a false-negative result of the
huge_page_size(h) == PMD_SIZE check. I can imagine that some
architectures (power and ia64, I guess) allow several page sizes on the
same page table level, but only one of them is PMD_SIZE.

It does not seem to be a problem currently, since we enable split PMD
lock only on x86 and s390.

A possible solution is to annotate each hstate with the page table
level it corresponds to.

-- 
 Kirill A. Shutemov