On Wed, Aug 29, 2018 at 2:49 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> On Wed, 29 Aug 2018 01:11:43 -0700
> Nadav Amit <namit@xxxxxxxxxx> wrote:
>
>> From: Andy Lutomirski <luto@xxxxxxxxxx>
>>
>> Sometimes we want to set temporary page-table entries (PTEs) in one of
>> the cores, without allowing other cores to use - even speculatively -
>> these mappings. There are two benefits for doing so:
>>
>> (1) Security: if sensitive PTEs are set, temporary mm prevents their use
>> in other cores. This hardens the security as it prevents exploiting a
>> dangling pointer to overwrite sensitive data using the sensitive PTE.
>>
>> (2) Avoiding TLB shootdowns: the PTEs do not need to be flushed in
>> remote page-tables.
>>
>> To do so a temporary mm_struct can be used. Mappings which are private
>> for this mm can be set in the userspace part of the address-space.
>> During the whole time in which the temporary mm is loaded, interrupts
>> must be disabled.
>>
>> The first use-case for temporary PTEs, which will follow, is for poking
>> the kernel text.
>>
>> [ Commit message was written by Nadav ]
>>
>> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
>> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
>> ---
>>  arch/x86/include/asm/mmu_context.h | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
>> index eeeb9289c764..96afc8c0cf15 100644
>> --- a/arch/x86/include/asm/mmu_context.h
>> +++ b/arch/x86/include/asm/mmu_context.h
>> @@ -338,4 +338,24 @@ static inline unsigned long __get_current_cr3_fast(void)
>>  	return cr3;
>>  }
>>
>> +typedef struct {
>> +	struct mm_struct *prev;
>> +} temporary_mm_state_t;
>> +
>> +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
>> +{
>> +	temporary_mm_state_t state;
>> +
>> +	lockdep_assert_irqs_disabled();
>> +	state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
>> +	switch_mm_irqs_off(NULL, mm, current);
>> +	return state;
>> +}
>
> Hmm, why don't we return mm_struct *prev directly?

I did it this way to make it easier to add debugging stuff later.
Also, when I first wrote this, I stashed the old CR3 instead of the old
mm_struct, and it seemed like callers should be insulated from details
like this.
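
To illustrate the insulation argument, here is a minimal userspace sketch
of the same pattern. It is not kernel code, and every name in it is
hypothetical: struct mock_mm, loaded_mm and irqs_disabled_flag stand in
for struct mm_struct, cpu_tlbstate.loaded_mm and the real IRQ state, and
unuse_temp_ctx() is an assumed restore counterpart that the quoted hunk
does not show. The caller only stores the returned token and hands it
back, so the token's contents can change without touching callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct mm_struct; only its identity matters here. */
struct mock_mm {
	int id;
};

/* Stand-ins for cpu_tlbstate.loaded_mm and the CPU's IRQ-disabled state. */
static struct mock_mm *loaded_mm;
static bool irqs_disabled_flag;

/*
 * Opaque token. Callers keep it and pass it back; they never read .prev.
 * Swapping the field for a saved CR3 value, or adding debug fields,
 * would therefore not change any call site.
 */
typedef struct {
	struct mock_mm *prev;
} temp_ctx_state_t;

static inline temp_ctx_state_t use_temp_ctx(struct mock_mm *mm)
{
	temp_ctx_state_t state;

	/* Plays the role of lockdep_assert_irqs_disabled(). */
	assert(irqs_disabled_flag);
	state.prev = loaded_mm;
	/* Plays the role of switch_mm_irqs_off(). */
	loaded_mm = mm;
	return state;
}

/* Hypothetical counterpart: restore whatever was loaded before. */
static inline void unuse_temp_ctx(temp_ctx_state_t state)
{
	assert(irqs_disabled_flag);
	loaded_mm = state.prev;
}
```

A caller would set irqs_disabled_flag, call use_temp_ctx(&tmp), do its
work, and then call unuse_temp_ctx() with the saved token. Returning a
bare pointer would work just as well today, but it would expose the
representation to every call site.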