On Fri, Jul 24, 2020 at 2:25 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: > > Ira, > > Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes: > > Ira Weiny <ira.weiny@xxxxxxxxx> writes: > >> On Thu, Jul 23, 2020 at 09:53:20PM +0200, Thomas Gleixner wrote: > >> I think, after fixing my code (see below), using idtentry_state could still > >> work. If the per-cpu cache and the MSR is updated in idtentry_exit() that > >> should carry the state to the new cpu, correct? > > > > I'm way too tired to think about that now. Will have a look tomorrow > > with brain awake. > > Not that I'm way more awake now, but at least I have the feeling that my > brain is not completely useless. > > Let me summarize what I understood: > > 1) A per CPU cache which shadows the current state of the MSR, i.e. the > current valid key. You use that to avoid costly MSR writes if the > key does not change. > > 2) On idtentry you store the key on entry in idtentry_state, clear it > in the MSR and shadow state if necessary and restore it on exit. > > 3) On context switch out you save the per CPU cache value in the task > and on context switch in you restore it from there. > > Yes, that works (see below for #2) and sorry for my confusion yesterday > about storing this in task state. > > #2 requires to handle the exceptions which do not go through > idtentry_enter/exit() seperately, but that's a manageable amount. It's > the ones which use IDTENTRY_RAW or a variant of it. > > #BP, #MC, #NMI, #DB, #DF need extra local storage as all the kernel > entries for those use nmi_enter()/exit(). So you just can create > wrappers around those. Somehting like this > > static __always_inline idtentry_state_t idtentry_nmi_enter(void) > { > idtentry_state_t state = {}; > > nmi_enter(); > instrumentation_begin(); > state.key = save_and_clear_key(); > instrumentation_end(); > } > > static __always_inline void idtentry_nmi_exit(idtentry_state_t state) > { > instrumentation_begin(); > restore_key(state.key); > instrumentation_end(); > nmi_exit(); > } > > #UD and #PF are using the raw entry variant as well but still invoke > idtentry_enter()/exit(). #PF does not need any work. #UD handles > WARN/BUG without going through idtentry_enter() first, but I don't think > that's an issue unless a not 0 key would prevent writing to the console > device. You surely can figure that out. Putting on my mm maintainer hat for a moment, I really think that we want oopses to print PKRS along with all the other registers when we inevitably oops due to a page fault. And we probably also want it in the nasty nested case where we get infinite page faults and eventually double fault. I'm sure it's *possible* to wire this up if we stick it in idtentry_state, but it's trivial if we stick it in pt_regs. I'm okay with doing the save/restore in C (in fact, I prefer that), but I think that either the value should be stuck in pt_regs or we should find a way to teach the unwinder to locate idtentry_state. And, if we go with idtentry_state, we should make a signature change to nmi_enter() to also provide idtentry_state or some equivalent object. --Andy