at 8:41 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> On Wed, Aug 29, 2018 at 2:49 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>> On Wed, 29 Aug 2018 01:11:43 -0700
>> Nadav Amit <namit@xxxxxxxxxx> wrote:
>>
>>> From: Andy Lutomirski <luto@xxxxxxxxxx>
>>>
>>> Sometimes we want to set temporary page-table entries (PTEs) in one of
>>> the cores, without allowing other cores to use - even speculatively -
>>> these mappings. There are two benefits for doing so:
>>>
>>> (1) Security: if sensitive PTEs are set, temporary mm prevents their use
>>> in other cores. This hardens the security as it prevents exploiting a
>>> dangling pointer to overwrite sensitive data using the sensitive PTE.
>>>
>>> (2) Avoiding TLB shootdowns: the PTEs do not need to be flushed in
>>> remote page-tables.
>>>
>>> To do so a temporary mm_struct can be used. Mappings which are private
>>> for this mm can be set in the userspace part of the address-space.
>>> During the whole time in which the temporary mm is loaded, interrupts
>>> must be disabled.
>>>
>>> The first use-case for temporary PTEs, which will follow, is for poking
>>> the kernel text.
>>>
>>> [ Commit message was written by Nadav ]
>>>
>>> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
>>> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
>>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
>>> ---
>>>  arch/x86/include/asm/mmu_context.h | 20 ++++++++++++++++++++
>>>  1 file changed, 20 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
>>> index eeeb9289c764..96afc8c0cf15 100644
>>> --- a/arch/x86/include/asm/mmu_context.h
>>> +++ b/arch/x86/include/asm/mmu_context.h
>>> @@ -338,4 +338,24 @@ static inline unsigned long __get_current_cr3_fast(void)
>>>  	return cr3;
>>>  }
>>>
>>> +typedef struct {
>>> +	struct mm_struct *prev;
>>> +} temporary_mm_state_t;
>>> +
>>> +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
>>> +{
>>> +	temporary_mm_state_t state;
>>> +
>>> +	lockdep_assert_irqs_disabled();
>>> +	state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
>>> +	switch_mm_irqs_off(NULL, mm, current);
>>> +	return state;
>>> +}
>>
>> Hmm, why don't we return mm_struct *prev directly?
>
> I did it this way to make it easier to add future debugging stuff
> later. Also, when I first wrote this, I stashed the old CR3 instead
> of the old mm_struct, and it seemed like callers should be insulated
> from details like this.

Andy, please let me know if you want me to change it somehow, and please
provide your signed-off-by.
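For illustration, the insulation Andy describes can be sketched in
user-space C. Everything below is a hypothetical stand-in, not the kernel
code: loaded_mm and irqs_disabled_flag mock cpu_tlbstate.loaded_mm and the
IRQ state, mock_switch_mm() replaces switch_mm_irqs_off(), and
unuse_temporary_mm() is an assumed counterpart that the quoted hunk does
not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of this is the kernel's code. */
struct mm_struct {
	int id;
};

static struct mm_struct *loaded_mm;	/* mocks cpu_tlbstate.loaded_mm */
static bool irqs_disabled_flag;		/* mocks the CPU's IRQ-disabled state */

/* Opaque token: callers store it and hand it back, never look inside. */
typedef struct {
	struct mm_struct *prev;
} temporary_mm_state_t;

/* Mock of switch_mm_irqs_off(): only records which mm is loaded. */
static void mock_switch_mm(struct mm_struct *next)
{
	loaded_mm = next;
}

static temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
{
	temporary_mm_state_t state;

	assert(irqs_disabled_flag);	/* mock of lockdep_assert_irqs_disabled() */
	state.prev = loaded_mm;
	mock_switch_mm(mm);
	return state;
}

/* Assumed counterpart for this sketch; restores whatever the token holds. */
static void unuse_temporary_mm(temporary_mm_state_t prev)
{
	assert(irqs_disabled_flag);
	mock_switch_mm(prev.prev);
}
```

A caller only keeps the returned token and passes it back, so the struct
could later carry a saved CR3 or extra debugging fields without any change
at the call sites. Returning a bare mm_struct pointer would expose that
detail to every caller.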