> On Jun 15, 2020, at 12:45 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> 
> On Mon, Jun 15, 2020 at 10:06:20AM -0700, Andy Lutomirski wrote:
>>> On Mon, Jun 15, 2020 at 7:50 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> 
>> Hmm. IMO you're making two changes here, and this is fiddly enough
>> that it might be worth separating them for bisection purposes.
> 
> Sure, can do.
> 
>>> ---
>>> 
>>> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
>>> index af75109485c26..a47e74923c4c8 100644
>>> --- a/arch/x86/kernel/traps.c
>>> +++ b/arch/x86/kernel/traps.c
>>> @@ -218,21 +218,22 @@ static inline void handle_invalid_op(struct pt_regs *regs)
>>> 
>>>  DEFINE_IDTENTRY_RAW(exc_invalid_op)
>>>  {
>>> -	bool rcu_exit;
>>> -
>>>  	/*
>>>  	 * Handle BUG/WARN like NMIs instead of like normal idtentries:
>>>  	 * if we bugged/warned in a bad RCU context, for example, the last
>>>  	 * thing we want is to BUG/WARN again in the idtentry code, ad
>>>  	 * infinitum.
>>>  	 */
>>> -	if (!user_mode(regs) && is_valid_bugaddr(regs->ip)) {
>>> -		enum bug_trap_type type;
>>> +	if (!user_mode(regs)) {
>>> +		enum bug_trap_type type = BUG_TRAP_TYPE_NONE;
>>> 
>>>  		nmi_enter();
>>>  		instrumentation_begin();
>>>  		trace_hardirqs_off_finish();
>>> -		type = report_bug(regs->ip, regs);
>>> +
>>> +		if (is_valid_bugaddr(regs->ip))
>>> +			type = report_bug(regs->ip, regs);
>>> +
>> 
>> Sigh, this is indeed necessary.
> 
> :-)
> 
>>>  		if (regs->flags & X86_EFLAGS_IF)
>>>  			trace_hardirqs_on_prepare();
>>>  		instrumentation_end();
>>> @@ -249,13 +250,16 @@ DEFINE_IDTENTRY_RAW(exc_invalid_op)
>>>  	 * was just a normal #UD, we want to continue onward and
>>>  	 * crash.
>>>  	 */
>>> -	}
>>> +	handle_invalid_op(regs);
>> 
>> But this is really a separate change.  This makes handle_invalid_op()
>> be NMI-like even for non-BUG/WARN kernel #UD entries.  One might argue
>> that this doesn't matter, and that's probably right, but I think it
>> should be its own change with its own justification.  With just my
>> patch, I intentionally call handle_invalid_op() via the normal
>> idtentry_enter_cond_rcu() path.
> 
> All !user exceptions really should be NMI-like.  If you want to go
> overboard, I suppose you can look at IF and have them behave
> interrupt-like when set, but why make things complicated.

This entire rabbit hole opened because of #PF, so at the very least the
set of exceptions that are permitted to schedule when they came from
kernel mode needs to remain schedulable.  Prior to the giant changes,
all the non-IST *exceptions*, but not the interrupts, were schedulable
from kernel mode, assuming the original context could schedule.  Right
now, interrupts can schedule too, which is nice if we ever want to
fully clean up the Xen abomination.

I suppose we could make it so #PF opts in to special treatment again,
but we should decide that the result is simpler or otherwise better
before we do that.  One possible justification would be that the
schedulable entry variant is more complicated, and most kernel
exceptions, except the ones with fixups, are bad news, so we want the
oopses to succeed.  But page faults are probably the most common source
of oopses, so that argument is a bit weak, and we really want page
faults to work even from nasty contexts.

> 
> Anyway, let me do smaller and proper patches for this.
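
[A rough sketch, not part of Peter's patch or anything posted in this
thread, of what the "#PF opts in" idea above could look like.  It
assumes the idtentry_enter_cond_rcu()/idtentry_exit_cond_rcu() helpers
named in the thread; the DEFINE_IDTENTRY_RAW_ERRORCODE wrapper,
read_cr2() use, and handle_page_fault() body are stand-ins for the real
fault handler, not a definitive implementation.]

/*
 * Illustrative sketch only: keep #PF on the schedulable,
 * conditional-RCU entry path while other kernel exceptions go
 * NMI-like.
 */
DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
{
	unsigned long address = read_cr2();
	bool rcu_exit;

	/*
	 * Kernel-mode page faults are the common source of fixups and
	 * oopses, so enter via the conditional-RCU path, which may
	 * schedule if the interrupted context could.
	 */
	rcu_exit = idtentry_enter_cond_rcu(regs);

	instrumentation_begin();
	handle_page_fault(regs, error_code, address);
	instrumentation_end();

	idtentry_exit_cond_rcu(regs, rcu_exit);
}

[The trade-off is the one discussed above: everything else stays
NMI-like, at the cost of keeping a second, more complicated entry
variant around just for #PF.]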