On 08/04/14 17:16, H. Peter Anvin wrote:
> On 04/08/2014 09:02 AM, Konrad Rzeszutek Wilk wrote:
>>>>
>>>> Amazon EC2 does have large memory instance types with NUMA exposed to
>>>> the guest (e.g. c3.8xlarge, i2.8xlarge, etc), so it'd be preferable
>>>> (to me anyway) if we didn't require !XEN.
>>
>> What about the patch that David Vrabel posted:
>>
>> http://osdir.com/ml/general/2014-03/msg41979.html
>>
>> Has anybody taken it for a spin?
>>
>
> Oh lovely, more pvops in low level paths.  I'm so thrilled.
>
> Incidentally, I wasn't even Cc:'d on that patch and was only added to
> the thread by Linus, but never saw the early bits of the thread
> including the actual patch.

I did resend a version CC'd to all the x86 maintainers and included some
performance figures for native (~1 extra clock cycle).

I've included it again below.

My preference would be to take this patch as it fixes it for both NUMA
rebalancing and any future uses that want to set/clear _PAGE_PRESENT.

David

8<--------------
x86: use pv-ops in {pte,pmd}_{set,clear}_flags()

Instead of using native functions to operate on the PTEs in
pte_set_flags(), pte_clear_flags(), pmd_set_flags() and
pmd_clear_flags(), use the PV aware ones.

This fixes a regression in Xen PV guests introduced by 1667918b6483
(mm: numa: clear numa hinting information on mprotect).

This has negligible performance impact on native since the pte_val()
and __pte() (etc.) calls are patched at runtime when running on bare
metal.  Measurements on a 3 GHz AMD 4284 give approx. 0.3 ns (~1 clock
cycle) of additional time for each function.

Xen PV guest page tables require that their entries use machine
addresses if the present bit (_PAGE_PRESENT) is set, and (for
successful migration) non-present PTEs must use pseudo-physical
addresses.  This is because on migration only the MFNs in present PTEs
are translated to PFNs (canonicalised) so they may be translated back
to the new MFN in the destination domain (uncanonicalised).

pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma() set
and clear the _PAGE_PRESENT bit using pte_set_flags(),
pte_clear_flags(), etc.

In a Xen PV guest, these functions must translate MFNs to PFNs when
clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
_PAGE_PRESENT.

Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
Cc: Steven Noonan <steven@xxxxxxxxxxxxxx>
Cc: Elena Ufimtseva <ufimtseva@xxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> [3.12+]
---
 arch/x86/include/asm/pgtable.h |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index bbc8b12..323e5e2 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -174,16 +174,16 @@ static inline int has_transparent_hugepage(void)
 
 static inline pte_t pte_set_flags(pte_t pte, pteval_t set)
 {
-	pteval_t v = native_pte_val(pte);
+	pteval_t v = pte_val(pte);
 
-	return native_make_pte(v | set);
+	return __pte(v | set);
 }
 
 static inline pte_t pte_clear_flags(pte_t pte, pteval_t clear)
 {
-	pteval_t v = native_pte_val(pte);
+	pteval_t v = pte_val(pte);
 
-	return native_make_pte(v & ~clear);
+	return __pte(v & ~clear);
 }
 
 static inline pte_t pte_mkclean(pte_t pte)
@@ -248,14 +248,14 @@ static inline pte_t pte_mkspecial(pte_t pte)
 
 static inline pmd_t pmd_set_flags(pmd_t pmd, pmdval_t set)
 {
-	pmdval_t v = native_pmd_val(pmd);
+	pmdval_t v = pmd_val(pmd);
 
 	return __pmd(v | set);
 }
 
 static inline pmd_t pmd_clear_flags(pmd_t pmd, pmdval_t clear)
 {
-	pmdval_t v = native_pmd_val(pmd);
+	pmdval_t v = pmd_val(pmd);
 
 	return __pmd(v & ~clear);
 }