Re: [RFC PATCH v9 00/13] Add support for eXclusive Page Frame Ownership

Khalid,

Thanks for these patches. We will test them on x86 and investigate the Arm pieces highlighted.

Jon.

-- 
Computer Architect


> On Apr 4, 2019, at 00:37, Khalid Aziz <khalid.aziz@xxxxxxxxxx> wrote:
> 
> This is another update to the work Juerg, Tycho and Julian have
> done on XPFO. After the last round of updates, we were seeing very
> significant performance penalties when stale TLB entries were
> flushed actively after an XPFO TLB update.  Benchmark for measuring
> performance is kernel build using parallel make. To get full
> protection from ret2dir attacks, we must flush stale TLB entries.
> Performance penalty from flushing stale TLB entries goes up as the
> number of cores goes up. On a desktop class machine with only 4
> cores, enabling TLB flush for stale entries causes system time for
> "make -j4" to go up by a factor of 2.61x but on a larger machine
> with 96 cores, system time with "make -j60" goes up by a factor of
> 26.37x!  I have been working on reducing this performance penalty.
> 
> I implemented two solutions to reduce the performance penalty, and
> they have had a large impact. XPFO code flushes TLB every time a page is
> allocated to userspace. It does so by sending IPIs to all processors
> to flush TLB. Back to back allocations of pages to userspace on
> multiple processors results in a storm of IPIs.  Each one of these
> incoming IPIs is handled by a processor by flushing its TLB. To
> reduce this IPI storm, I have added a per CPU flag that can be set
> to tell a processor to flush its TLB. A processor checks this flag
> on every context switch. If the flag is set, it flushes its TLB and
> clears the flag. This allows for multiple TLB flush requests to a
> single CPU to be combined into a single request. A kernel TLB entry
> for a page that has been allocated to userspace is flushed on all
> processors unlike the previous version of this patch. A processor
> could hold a stale kernel TLB entry that was removed on another
> processor until the next context switch. A local userspace page
> allocation by the currently running process could force the TLB
> flush earlier for such entries.
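To make the deferred-flush idea above concrete, here is a minimal user-space model, not the kernel code from the series. It assumes the requesting CPU flushes its own TLB immediately and only sets a flag on the others; the names Cpu, flush_pending, request_kernel_tlb_flush and simulate are hypothetical and exist only for this illustration.

```python
class Cpu:
    """Toy model of one processor with a per-CPU 'flush pending' flag."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.flush_pending = False  # hypothetical stand-in for the per-CPU flag
        self.flushes = 0            # number of TLB flushes actually performed

    def flush_tlb(self):
        # A flush satisfies every request that was pending for this CPU.
        self.flushes += 1
        self.flush_pending = False

    def context_switch(self):
        # The flag is checked on every context switch, as described above.
        if self.flush_pending:
            self.flush_tlb()


def request_kernel_tlb_flush(cpus, current):
    """Assumption: flush locally right away, defer on all other CPUs."""
    for cpu in cpus:
        if cpu is current:
            cpu.flush_tlb()
        else:
            cpu.flush_pending = True  # no IPI, just mark the CPU


def simulate(num_cpus=4, requests=100):
    """Issue many flush requests from CPU 0, then let every CPU switch context."""
    cpus = [Cpu(i) for i in range(num_cpus)]
    for _ in range(requests):
        request_kernel_tlb_flush(cpus, current=cpus[0])
    for cpu in cpus:
        cpu.context_switch()
    return [cpu.flushes for cpu in cpus]
```

In this model, simulate(4, 100) gives one flush on each remote CPU regardless of how many requests were issued, which is the coalescing effect described above. The cost is the window between a request and the next context switch, during which a remote CPU may still hold the stale entry.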
> 
> The other solution reduces the number of TLB flushes required, by
> performing TLB flush for multiple pages at one time when pages are
> refilled on the per-cpu freelist. If the pages being added to
> per-cpu freelist are marked for userspace allocation, TLB entries
> for these pages can be flushed upfront and pages tagged as currently
> unmapped. When any such page is allocated to userspace, there is no
> need to perform a TLB flush at that time any more. This batching of
> TLB flushes reduces performance impact further. Similarly when
> these user pages are freed by userspace and added back to per-cpu
> free list, they are left unmapped and tagged so. This further
> optimization reduced performance impact from 1.32x to 1.28x for
> 96-core server and from 1.31x to 1.27x for a 4-core desktop.
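The batching described above can be sketched with a small counting model. This is an illustration under stated assumptions, not the patch itself: the batch size, the Page and PerCpuFreelist types and the count_flushes helper are all hypothetical, and a "flush" here is just a counter increment.

```python
from dataclasses import dataclass


@dataclass
class Page:
    pfn: int
    unmapped: bool = False  # tagged when its kernel mapping is already gone


class PerCpuFreelist:
    """Toy per-CPU freelist that counts TLB flushes for user allocations."""

    def __init__(self, batch_size, batched):
        self.batch_size = batch_size
        self.batched = batched
        self.pages = []
        self.tlb_flushes = 0
        self._next_pfn = 0

    def refill(self):
        batch = [Page(self._next_pfn + i) for i in range(self.batch_size)]
        self._next_pfn += self.batch_size
        if self.batched:
            # Unmap the whole batch up front and pay for a single flush.
            for page in batch:
                page.unmapped = True
            self.tlb_flushes += 1
        self.pages.extend(batch)

    def alloc_user_page(self):
        if not self.pages:
            self.refill()
        page = self.pages.pop()
        if not page.unmapped:
            # Unbatched case: each allocation to userspace needs its own flush.
            page.unmapped = True
            self.tlb_flushes += 1
        return page


def count_flushes(num_allocs, batch_size, batched):
    freelist = PerCpuFreelist(batch_size, batched)
    for _ in range(num_allocs):
        freelist.alloc_user_page()
    return freelist.tlb_flushes
```

With 64 allocations and a hypothetical batch of 16 pages, the model needs 64 flushes unbatched and 4 batched. The real saving depends on how many refilled pages actually end up in userspace, which this sketch does not model.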
> 
> I measured system time for parallel make with unmodified 4.20
> kernel, 4.20 with XPFO patches before these patches and then again
> after applying each of these patches. Here are the results:
> 
> Hardware: 96-core Intel Xeon Platinum 8160 CPU @ 2.10GHz, 768 GB RAM
> make -j60 all
> 
> 5.0                    913.862s
> 5.0+this patch series            1165.259s    1.28x
> 
> 
> Hardware: 4-core Intel Core i5-3550 CPU @ 3.30GHz, 8G RAM
> make -j4 all
> 
> 5.0                    610.642s
> 5.0+this patch series            773.075s    1.27x
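As a quick check on the tables above, the slowdown factor is simply patched system time divided by baseline system time. The helper name slowdown is hypothetical; the timings are the ones quoted above.

```python
def slowdown(baseline_s, patched_s):
    """Ratio of patched to baseline system time, rounded to two decimals."""
    return round(patched_s / baseline_s, 2)


# System times in seconds, copied from the tables above.
server = slowdown(913.862, 1165.259)   # 96-core Xeon, make -j60
desktop = slowdown(610.642, 773.075)   # 4-core Core i5, make -j4
```

Both values agree with the 1.28x and 1.27x figures reported in the tables.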
> 
> Performance with this patch set is good enough to use it as a
> starting point for further refinement before we merge it into the main
> kernel, hence RFC.
> 
> I have restructured the patches in this version to separate out
> architecture independent code. I folded much of the code
> improvement by Julian to not use page extension into patch 3. 
> 
> What remains to be done beyond this patch series:
> 
> 1. Performance improvements: Ideas to explore - (1) kernel mappings
>   private to an mm, (2) Any others??
> 2. Re-evaluate the patch "arm64/mm: Add support for XPFO to swiotlb"
>   from Juerg. I dropped it for now since swiotlb code for ARM has
>   changed a lot since this patch was written. I could use help
>   from ARM experts on this.
> 3. Extend the patch "xpfo, mm: Defer TLB flushes for non-current
>   CPUs" to other architectures besides x86.
> 4. Change kmap to not map the page back to physmap, instead map it
>   to a new va similar to what kmap_high does. Mapping page back
>   into physmap re-opens the ret2dir security hole for the duration of
>   kmap. All of the kmap_high and related code can be reused for this
>   but that will require restructuring that code so it can be built for
>   64-bits as well. Any objections to that?
> 
> ---------------------------------------------------------
> 
> Juerg Haefliger (6):
>  mm: Add support for eXclusive Page Frame Ownership (XPFO)
>  xpfo, x86: Add support for XPFO for x86-64
>  lkdtm: Add test for XPFO
>  arm64/mm: Add support for XPFO
>  swiotlb: Map the buffer if it was unmapped by XPFO
>  arm64/mm, xpfo: temporarily map dcache regions
> 
> Julian Stecklina (1):
>  xpfo, mm: optimize spinlock usage in xpfo_kunmap
> 
> Khalid Aziz (2):
>  xpfo, mm: Defer TLB flushes for non-current CPUs (x86 only)
>  xpfo, mm: Optimize XPFO TLB flushes by batching them together
> 
> Tycho Andersen (4):
>  mm: add MAP_HUGETLB support to vm_mmap
>  x86: always set IF before oopsing from page fault
>  mm: add a user_virt_to_phys symbol
>  xpfo: add primitives for mapping underlying memory
> 
> .../admin-guide/kernel-parameters.txt         |   6 +
> arch/arm64/Kconfig                            |   1 +
> arch/arm64/mm/Makefile                        |   2 +
> arch/arm64/mm/flush.c                         |   7 +
> arch/arm64/mm/mmu.c                           |   2 +-
> arch/arm64/mm/xpfo.c                          |  66 ++++++
> arch/x86/Kconfig                              |   1 +
> arch/x86/include/asm/pgtable.h                |  26 +++
> arch/x86/include/asm/tlbflush.h               |   1 +
> arch/x86/mm/Makefile                          |   2 +
> arch/x86/mm/fault.c                           |   6 +
> arch/x86/mm/pageattr.c                        |  32 +--
> arch/x86/mm/tlb.c                             |  39 ++++
> arch/x86/mm/xpfo.c                            | 185 +++++++++++++++++
> drivers/misc/lkdtm/Makefile                   |   1 +
> drivers/misc/lkdtm/core.c                     |   3 +
> drivers/misc/lkdtm/lkdtm.h                    |   5 +
> drivers/misc/lkdtm/xpfo.c                     | 196 ++++++++++++++++++
> include/linux/highmem.h                       |  34 +--
> include/linux/mm.h                            |   2 +
> include/linux/mm_types.h                      |   8 +
> include/linux/page-flags.h                    |  23 +-
> include/linux/xpfo.h                          | 191 +++++++++++++++++
> include/trace/events/mmflags.h                |  10 +-
> kernel/dma/swiotlb.c                          |   3 +-
> mm/Makefile                                   |   1 +
> mm/compaction.c                               |   2 +-
> mm/internal.h                                 |   2 +-
> mm/mmap.c                                     |  19 +-
> mm/page_alloc.c                               |  19 +-
> mm/page_isolation.c                           |   2 +-
> mm/util.c                                     |  32 +++
> mm/xpfo.c                                     | 170 +++++++++++++++
> security/Kconfig                              |  27 +++
> 34 files changed, 1047 insertions(+), 79 deletions(-)
> create mode 100644 arch/arm64/mm/xpfo.c
> create mode 100644 arch/x86/mm/xpfo.c
> create mode 100644 drivers/misc/lkdtm/xpfo.c
> create mode 100644 include/linux/xpfo.h
> create mode 100644 mm/xpfo.c
> 
> -- 
> 2.17.1
> 




