On Tue, 3 Mar 2015 10:44:24 -0700 Toshi Kani <toshi.kani@xxxxxx> wrote:

> This patch implements huge KVA mapping interfaces on x86.
>
> On x86, MTRRs can override PAT memory types with a 4KB granularity.
> When using a huge page, MTRRs can override the memory type of the
> huge page, which may lead to a performance penalty.  The processor
> can also behave in an undefined manner if a huge page is mapped to
> a memory range that MTRRs have mapped with multiple different memory
> types.  Therefore, the mapping code falls back to use a smaller page
> size toward 4KB when a mapping range is covered by non-WB type of
> MTRRs.  The WB type of MTRRs has no effect on the PAT memory types.
>
> pud_set_huge() and pmd_set_huge() call mtrr_type_lookup() to see
> if a given range is covered by MTRRs.  MTRR_TYPE_WRBACK indicates
> that the range is either covered by WB or not covered and the MTRR
> default value is set to WB.  0xFF indicates that MTRRs are disabled.
>
> HAVE_ARCH_HUGE_VMAP is selected when X86_64 or X86_32 with X86_PAE
> is set.  X86_32 without X86_PAE is not supported since such a config
> is unlikely to benefit from this feature, and there was an issue
> found in testing.
>
> ...
>
> +
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> +int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
> +{
> +	u8 mtrr;
> +
> +	/*
> +	 * Do not use a huge page when the range is covered by non-WB type
> +	 * of MTRRs.
> +	 */
> +	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
> +	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
> +		return 0;

It would be good to notify the operator in some way when this happens.
Otherwise the kernel will run more slowly and there's no way of knowing
why.  I guess slap a pr_info() in there.  Or maybe pr_warn()?

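To make the suggestion concrete, here is a rough userspace sketch of the fallback check with a message added. It is not the patch's code: mtrr_type_lookup_stub(), fake_mtrr_type, huge_map_allowed() and MTRR_DISABLED are hypothetical names invented for illustration, and fprintf() stands in for whichever of pr_info() or pr_warn() ends up being used.

```c
#include <stdint.h>
#include <stdio.h>

/* MTRR_TYPE_WRBACK is 6 in the kernel's MTRR definitions. */
#define MTRR_TYPE_WRBACK	6
/* Hypothetical name: the patch uses the bare value 0xFF for "MTRRs disabled". */
#define MTRR_DISABLED		0xFF

/* Hypothetical knob standing in for the real MTRR state. */
static uint8_t fake_mtrr_type = MTRR_TYPE_WRBACK;

/* Hypothetical stub for mtrr_type_lookup(); it ignores the range. */
static uint8_t mtrr_type_lookup_stub(uint64_t start, uint64_t end)
{
	(void)start;
	(void)end;
	return fake_mtrr_type;
}

/*
 * Returns 1 if a huge mapping of [addr, addr + size) may be used,
 * 0 if the caller must fall back to a smaller page size.
 */
static int huge_map_allowed(uint64_t addr, uint64_t size)
{
	uint8_t mtrr = mtrr_type_lookup_stub(addr, addr + size);

	if (mtrr != MTRR_TYPE_WRBACK && mtrr != MTRR_DISABLED) {
		/* In the kernel this would be pr_info() or pr_warn(). */
		fprintf(stderr,
			"huge mapping at 0x%llx (size 0x%llx) refused: MTRR type %u\n",
			(unsigned long long)addr, (unsigned long long)size,
			(unsigned int)mtrr);
		return 0;
	}
	return 1;
}
```

With the message in place, an operator who sees a slow mapping can tell from the log that the range fell back to smaller pages and which MTRR type caused it.
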
> +	prot = pgprot_4k_2_large(prot);
> +
> +	set_pte((pte_t *)pud, pfn_pte(
> +		(u64)addr >> PAGE_SHIFT,
> +		__pgprot(pgprot_val(prot) | _PAGE_PSE)));
> +
> +	return 1;
> +}
> +
> +int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
> +{
> +	u8 mtrr;
> +
> +	/*
> +	 * Do not use a huge page when the range is covered by non-WB type
> +	 * of MTRRs.
> +	 */
> +	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
> +	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
> +		return 0;
> +
> +	prot = pgprot_4k_2_large(prot);
> +
> +	set_pte((pte_t *)pmd, pfn_pte(
> +		(u64)addr >> PAGE_SHIFT,
> +		__pgprot(pgprot_val(prot) | _PAGE_PSE)));
> +
> +	return 1;
> +}
>
> +int pud_clear_huge(pud_t *pud)
> +{
> +	if (pud_large(*pud)) {
> +		pud_clear(pud);
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +int pmd_clear_huge(pmd_t *pmd)
> +{
> +	if (pmd_large(*pmd)) {
> +		pmd_clear(pmd);
> +		return 1;
> +	}
> +
> +	return 0;
> +}

I didn't see anywhere where the return values of these functions are
documented.  It's all fairly obvious, but we could help the readers a
bit.
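
As an illustration of the kind of return-value documentation meant here, the following is a simplified userspace model, not the kernel code. entry_t, ENTRY_PRESENT, ENTRY_PSE, entry_set_huge() and entry_clear_huge() are hypothetical stand-ins for pud_t/pmd_t, the x86 page flags and the patch's helpers, and the range_is_wb argument is an assumption that replaces the MTRR lookup.

```c
#include <stdint.h>

/* Hypothetical stand-in for pud_t / pmd_t. */
typedef struct { uint64_t val; } entry_t;

#define ENTRY_PRESENT	(1ULL << 0)	/* mirrors _PAGE_PRESENT (bit 0) */
#define ENTRY_PSE	(1ULL << 7)	/* mirrors _PAGE_PSE (bit 7) */

/*
 * entry_set_huge - try to install a huge mapping in @e
 *
 * Returns 1 if the huge mapping was installed.
 * Returns 0 if it was refused and @e was left untouched; the caller
 * is expected to fall back to a smaller page size.
 */
static int entry_set_huge(entry_t *e, uint64_t phys, int range_is_wb)
{
	if (!range_is_wb)
		return 0;

	e->val = phys | ENTRY_PRESENT | ENTRY_PSE;
	return 1;
}

/*
 * entry_clear_huge - clear @e if it currently maps a huge page
 *
 * Returns 1 if @e was a huge mapping and has been cleared.
 * Returns 0 if @e was not a huge mapping and was left untouched.
 */
static int entry_clear_huge(entry_t *e)
{
	uint64_t huge = ENTRY_PRESENT | ENTRY_PSE;

	if ((e->val & huge) == huge) {
		e->val = 0;
		return 1;
	}
	return 0;
}
```

Comments of this shape above pud_set_huge(), pmd_set_huge(), pud_clear_huge() and pmd_clear_huge() would spell out that 1 means "done" and 0 means "nothing changed".
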