Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions

Andy Lutomirski <luto@xxxxxxxxxx> · Fri, 24 Jul 2020 17:09:00 -0700

On Fri, Jul 24, 2020 at 2:25 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Ira,
>
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
> > Ira Weiny <ira.weiny@xxxxxxxxx> writes:
> >> On Thu, Jul 23, 2020 at 09:53:20PM +0200, Thomas Gleixner wrote:
> >> I think, after fixing my code (see below), using idtentry_state could still
> >> work.  If the per-cpu cache and the MSR is updated in idtentry_exit() that
> >> should carry the state to the new cpu, correct?
> >
> > I'm way too tired to think about that now. Will have a look tomorrow
> > with brain awake.
>
> Not that I'm way more awake now, but at least I have the feeling that my
> brain is not completely useless.
>
> Let me summarize what I understood:
>
>   1) A per CPU cache which shadows the current state of the MSR, i.e. the
>      current valid key. You use that to avoid costly MSR writes if the
>      key does not change.
>
>   2) On idtentry you store the key on entry in idtentry_state, clear it
>      in the MSR and shadow state if necessary and restore it on exit.
>
>   3) On context switch out you save the per CPU cache value in the task
>      and on context switch in you restore it from there.
>
> Yes, that works (see below for #2) and sorry for my confusion yesterday
> about storing this in task state.
>
> #2 requires to handle the exceptions which do not go through
> idtentry_enter/exit() seperately, but that's a manageable amount. It's
> the ones which use IDTENTRY_RAW or a variant of it.
>
> #BP, #MC, #NMI, #DB, #DF need extra local storage as all the kernel
> entries for those use nmi_enter()/exit(). So you just can create
> wrappers around those. Somehting like this
>
> static __always_inline idtentry_state_t idtentry_nmi_enter(void)
> {
>         idtentry_state_t state = {};
>
>         nmi_enter();
>         instrumentation_begin();
>         state.key = save_and_clear_key();
>         instrumentation_end();
> }
>
> static __always_inline void idtentry_nmi_exit(idtentry_state_t state)
> {
>         instrumentation_begin();
>         restore_key(state.key);
>         instrumentation_end();
>         nmi_exit();
> }
>
> #UD and #PF are using the raw entry variant as well but still invoke
> idtentry_enter()/exit(). #PF does not need any work. #UD handles
> WARN/BUG without going through idtentry_enter() first, but I don't think
> that's an issue unless a not 0 key would prevent writing to the console
> device. You surely can figure that out.

Putting on my mm maintainer hat for a moment, I really think that we
want oopses to print PKRS along with all the other registers when we
inevitably oops due to a page fault.  And we probably also want it in
the nasty nested case where we get infinite page faults and eventually
double fault.

I'm sure it's *possible* to wire this up if we stick it in
idtentry_state, but it's trivial if we stick it in pt_regs.  I'm okay
with doing the save/restore in C (in fact, I prefer that), but I think
that either the value should be stuck in pt_regs or we should find a
way to teach the unwinder to locate idtentry_state.

And, if we go with idtentry_state, we should make a signature change
to nmi_enter() to also provide idtentry_state or some equivalent
object.

--Andy