Re: [PATCH 00/11] [v5] Use global pages with PTI

On 4/6/2018 3:55 PM, Dave Hansen wrote:
> Changes from v4:
>  * Fix compile error reported by Tom Lendacky

This built with CONFIG_RANDOMIZE_BASE=y, but failed to boot successfully.
I think you're missing the initialization of __default_kernel_pte_mask in
kaslr.c.

Thanks,
Tom

>  * Avoid setting _PAGE_GLOBAL on non-present entries
> 
> Changes from v3:
>  * Fix whitespace issue noticed by willy
>  * Clarify comments about X86_FEATURE_PGE checks
>  * Clarify commit message around the necessity of _PAGE_GLOBAL
>    filtering when CR4.PGE=0 or PGE is unsupported.
> 
> Changes from v2:
> 
>  * Add performance numbers to changelogs
>  * Fix compile error resulting from use of x86-specific
>    __default_kernel_pte_mask in arch-generic mm/early_ioremap.c
>  * Delay kernel text cloning until after we are done messing
>    with it (patch 11).
>  * Blacklist K8 explicitly from mapping all kernel text as
>    global (this should never happen because K8 does not use
>    pti when pti=auto, but we err on the safe side). (patch 11)
> 
> --
> 
> The later versions of the KAISER patches (pre-PTI) allowed the
> user/kernel shared areas to be GLOBAL.  The thought was that this would
> reduce the TLB overhead of keeping two copies of these mappings.
> 
> During the switch over to PTI, we seem to have lost our ability to have
> GLOBAL mappings.  This adds them back.
> 
> To measure the benefits of this, I took a modern Atom system without
> PCIDs and ran a microbenchmark[1] (higher is better):
> 
> No Global Lines (baseline  ): 6077741 lseeks/sec
> 88 Global Lines (kern entry): 7528609 lseeks/sec (+23.9%)
> 94 Global Lines (all ktext ): 8433111 lseeks/sec (+38.8%)
> 
> On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
> huge:
> 
> No Global pages (baseline): 15783951 lseeks/sec
> 28 Global pages (this set): 16054688 lseeks/sec
>                              +270737 lseeks/sec (+1.71%)
> 
> I also double-checked with a kernel compile on the Skylake system (lower
> is better):
> 
> No Global pages (baseline): 186.951 seconds time elapsed  ( +-  0.35% )
> 28 Global pages (this set): 185.756 seconds time elapsed  ( +-  0.09% )
>                              -1.195 seconds (-0.64%)
> 
> 1. https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c
> 
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Cc: Nadav Amit <namit@xxxxxxxxxx>
> 



