On Mon, Jul 17, 2023 at 6:45 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jul 17, 2023 at 07:33:25AM +0800, Guo Ren wrote:
> > On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
> > > > On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@xxxxxxxxxx wrote:
> > > > > > From: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> > > > > >
> > > > > > The irqentry_nmi_enter/exit would force the current context into in_interrupt.
> > > > > > That would trigger the kernel to dead panic, but the kdb still needs "ebreak" to
> > > > > > debug the kernel.
> > > > > >
> > > > > > Move irqentry_nmi_enter/exit to exception_enter/exit could correct handle_break
> > > > > > of the kernel side.
> > > > >
> > > > > This doesn't explain much if anything :/
> > > > >
> > > > > I'm confused (probably because I don't know RISC-V very well), what's
> > > > > EBREAK and how does it happen?
> > > > EBREAK is just an instruction of riscv which would rise breakpoint exception.
> > > >
> > > > >
> > > > >
> > > > > Specifically, if EBREAK can happen inside an local_irq_disable() region,
> > > > > then the below change is actively wrong. Any exception/interrupt that
> > > > > can happen while local_irq_disable() must be treated like an NMI.
> > > > When the ebreak happend out of local_irq_disable region, but
> > > > __nmi_enter forces handle_break() into in_interupt() state. So how
> > >
> > > And why is that a problem? I think I'm missing something fundamental
> > > here...
> > The irqentry_nmi_enter() would force the current context to get
> > in_interrupt=true, although ebreak happens in the context which is
> > in_interrupt=false.
> > A lot of checking codes, such as:
> >         if (in_interrupt())
> >                 panic("Fatal exception in interrupt");
>
> Why would you do that?!?
>
> Are you're trying to differentiate between an exception and an
> interrupt?
>
> You *could* have ebreak in an interrupt, right? So why panic the machine
> if that happens?
Do you mean the patch below? Yes, that would fix it.

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d..92899db6696b 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -85,8 +85,6 @@ void die(struct pt_regs *regs, const char *str)
 	spin_unlock_irqrestore(&die_lock, flags);
 	oops_exit();

-	if (in_interrupt())
-		panic("Fatal exception in interrupt");
 	if (panic_on_oops)
 		panic("Fatal exception");
 	if (ret != NOTIFY_STOP)
diff --git a/kernel/exit.c b/kernel/exit.c
index edb50b4c9972..a46a1aef66ce 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -940,8 +940,6 @@ void __noreturn make_task_dead(int signr)
 	struct task_struct *tsk = current;
 	unsigned int limit;

-	if (unlikely(in_interrupt()))
-		panic("Aiee, killing interrupt handler!");
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");

But how does x86 deal with this without modifying kernel/exit.c?

>
> > It would make the kernel panic, but we don't panic; we want back to the shell.
> > eg:
> > echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
> >
>

--
Best Regards
 Guo Ren
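
For reference, the in_interrupt() behaviour discussed above can be modelled
in userspace. This is only a simplified sketch, not kernel code: it assumes
the preempt_count bit layout and the NMI_OFFSET + HARDIRQ_OFFSET increment
done on NMI entry roughly match include/linux/preempt.h and
include/linux/hardirq.h, and every model_* name is hypothetical.

```c
/*
 * Userspace model of the preempt_count bookkeeping discussed in this thread.
 * Assumption: field widths and offsets are a simplified copy of the kernel's
 * layout; the model_* helpers are hypothetical and exist only in this sketch.
 */
#include <assert.h>
#include <stdbool.h>

#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	4

#define SOFTIRQ_SHIFT	PREEMPT_BITS
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define SOFTIRQ_MASK	(((1u << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(((1u << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)
#define NMI_MASK	(((1u << NMI_BITS) - 1) << NMI_SHIFT)

#define HARDIRQ_OFFSET	(1u << HARDIRQ_SHIFT)
#define NMI_OFFSET	(1u << NMI_SHIFT)

/* Stand-in for the per-task/per-CPU preempt count; 0 means plain task context. */
static unsigned int model_preempt_count;

/* Models in_interrupt(): true if any NMI, hardirq or softirq bit is set. */
static bool model_in_interrupt(void)
{
	return (model_preempt_count &
		(NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* Models the counter update done on NMI-style entry and exit. */
static void model_nmi_enter(void)
{
	model_preempt_count += NMI_OFFSET + HARDIRQ_OFFSET;
}

static void model_nmi_exit(void)
{
	model_preempt_count -= NMI_OFFSET + HARDIRQ_OFFSET;
}

/*
 * Models a breakpoint handler wrapped in NMI-style entry/exit.
 * Returns true if a die()-style "if (in_interrupt()) panic(...)" check
 * inside the handler would fire.
 */
static bool model_break_would_panic(void)
{
	bool panics;

	model_nmi_enter();
	panics = model_in_interrupt();
	model_nmi_exit();
	return panics;
}
```

Calling model_break_would_panic() from "task context" (count == 0) returns
true, which is the case this thread is about: the check fires even though
the ebreak itself did not happen in an interrupt.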