Re: [RFC PATCH 2/6] x86/mm: temporary mm struct

Masami Hiramatsu <mhiramat@xxxxxxxxxx> · Thu, 30 Aug 2018 10:38:59 +0900

On Wed, 29 Aug 2018 08:41:00 -0700
Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> On Wed, Aug 29, 2018 at 2:49 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > On Wed, 29 Aug 2018 01:11:43 -0700
> > Nadav Amit <namit@xxxxxxxxxx> wrote:
> >
> >> From: Andy Lutomirski <luto@xxxxxxxxxx>
> >>
> >> Sometimes we want to set temporary page-table entries (PTEs) on one of
> >> the cores, without allowing other cores to use - even speculatively -
> >> these mappings. There are two benefits to doing so:
> >>
> >> (1) Security: if sensitive PTEs are set, temporary mm prevents their use
> >> on other cores. This hardens the security as it prevents exploiting a
> >> dangling pointer to overwrite sensitive data using the sensitive PTE.
> >>
> >> (2) Avoiding TLB shootdowns: the PTEs do not need to be flushed in
> >> remote page-tables.
> >>
> >> To do so a temporary mm_struct can be used. Mappings which are private
> >> for this mm can be set in the userspace part of the address-space.
> >> During the whole time in which the temporary mm is loaded, interrupts
> >> must be disabled.
> >>
> >> The first use-case for temporary PTEs, which will follow, is for poking
> >> the kernel text.
> >>
> >> [ Commit message was written by Nadav ]
> >>
> >> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> >> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> >> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> >> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
> >> ---
> >>  arch/x86/include/asm/mmu_context.h | 20 ++++++++++++++++++++
> >>  1 file changed, 20 insertions(+)
> >>
> >> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
> >> index eeeb9289c764..96afc8c0cf15 100644
> >> --- a/arch/x86/include/asm/mmu_context.h
> >> +++ b/arch/x86/include/asm/mmu_context.h
> >> @@ -338,4 +338,24 @@ static inline unsigned long __get_current_cr3_fast(void)
> >>       return cr3;
> >>  }
> >>
> >> +typedef struct {
> >> +     struct mm_struct *prev;
> >> +} temporary_mm_state_t;
> >> +
> >> +static inline temporary_mm_state_t use_temporary_mm(struct mm_struct *mm)
> >> +{
> >> +     temporary_mm_state_t state;
> >> +
> >> +     lockdep_assert_irqs_disabled();
> >> +     state.prev = this_cpu_read(cpu_tlbstate.loaded_mm);
> >> +     switch_mm_irqs_off(NULL, mm, current);
> >> +     return state;
> >> +}
> >
> > Hmm, why don't we return mm_struct *prev directly?
> 
> I did it this way to make it easier to add future debugging stuff
> later. Also, when I first wrote this, I stashed the old CR3 instead
> of the old mm_struct, and it seemed like callers should be insulated
> from details like this.

Hmm, I see. But in that case, we should call it "struct temporary_mm"
and explicitly allocate (and pass) it, since we cannot return the
data structure from the stack. If we can combine it with the new mm, it
will be more encapsulated, e.g.

struct temporary_mm {
	struct mm_struct *mm;
	struct mm_struct *prev;
};

static struct temporary_mm poking_tmp_mm;

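poking_init()
{
	/* set up the temporary mm once, at init time */
	if (init_temporary_mm(&poking_tmp_mm, &init_mm))
		goto error;
	...
}

text_poke_safe()
{
	...
	/* prev is saved inside poking_tmp_mm, so nothing is returned */
	use_temporary_mm(&poking_tmp_mm);
	...
	unuse_temporary_mm(&poking_tmp_mm);
}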
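
To make the idea a bit more concrete, here is a rough userspace mock-up of
the interface I have in mind. It is only a sketch: the struct mm_struct
below, loaded_mm, irqs_disabled_mock and switch_mm_mock() are stand-ins I
made up for illustration, not the kernel definitions. Also, the mock
init_temporary_mm() simply adopts the mm it is given, whereas a real one
would presumably build a new mm based on init_mm.

```c
/*
 * Userspace mock-up of the proposed "struct temporary_mm" interface.
 * Everything marked "mock" or "stand-in" is hypothetical and only exists
 * so that the pattern can be compiled and exercised outside the kernel.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_struct { int id; };                  /* stand-in for the real mm_struct */

static struct mm_struct init_mm = { .id = 0 };
static struct mm_struct *loaded_mm = &init_mm; /* models cpu_tlbstate.loaded_mm */
static bool irqs_disabled_mock;                /* models "interrupts are disabled" */

struct temporary_mm {
	struct mm_struct *mm;   /* the private mm to switch to */
	struct mm_struct *prev; /* mm that was loaded before use_temporary_mm() */
};

/* mock of switch_mm_irqs_off(): only records which mm is loaded */
static void switch_mm_mock(struct mm_struct *next)
{
	loaded_mm = next;
}

/* returns 0 on success, -1 on bad arguments (mock: adopts @mm as-is) */
static int init_temporary_mm(struct temporary_mm *tmp, struct mm_struct *mm)
{
	if (!tmp || !mm)
		return -1;
	tmp->mm = mm;
	tmp->prev = NULL;
	return 0;
}

static void use_temporary_mm(struct temporary_mm *tmp)
{
	assert(irqs_disabled_mock);  /* models lockdep_assert_irqs_disabled() */
	assert(tmp->prev == NULL);   /* the kind of debug check this layout allows */
	tmp->prev = loaded_mm;
	switch_mm_mock(tmp->mm);
}

static void unuse_temporary_mm(struct temporary_mm *tmp)
{
	assert(irqs_disabled_mock);
	switch_mm_mock(tmp->prev);
	tmp->prev = NULL;
}
```

With irqs_disabled_mock set, a caller initializes one struct temporary_mm,
calls use_temporary_mm() on it, does its work, and calls
unuse_temporary_mm(), after which loaded_mm points at the previous mm again.
The caller never sees what is saved, so the saved state could later become
a CR3 value or carry extra debug fields without changing the callers.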

Any thoughts?

Thanks,

-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>