On 04/05/2022, 07:59:29, Michael Ellerman wrote: > Laurent Dufour <ldufour@xxxxxxxxxxxxx> writes: >> On 03/05/2022, 17:06:41, Michael Ellerman wrote: >>> Laurent Dufour <ldufour@xxxxxxxxxxxxx> writes: > ... >>>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c >>>> index 1f42aabbbab3..d7775b8c8853 100644 >>>> --- a/arch/powerpc/kernel/rtas.c >>>> +++ b/arch/powerpc/kernel/rtas.c >>>> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long); >>>> >>>> static inline void do_enter_rtas(unsigned long args) >>>> { >>>> + unsigned long msr; >>>> + >>>> + msr = mfmsr(); >>>> + BUG_ON(!(msr & MSR_RI)); >>> >>> I'm not sure about this. >>> >>> We call RTAS in some low-level places, so if we ever hit this BUG_ON >>> then it might cause us to crash badly, or recursively BUG. >>> >>> A WARN_ON_ONCE() might be safer? >> >> I'm afraid a BUG_ON is required here. Since MSR[RI] is set on RTAS exit so >> if it was not set when calling RTAS, that's a real issue and should >> generate unexpected behaviour. >> >> Do you have places in mind where RTAS could be called with !MSR[RI]? > > The main one I can think of is if someone is using > CONFIG_UDBG_RTAS_CONSOLE, then udbg_rtascon_putc() is wired up as > udbg_putc() and that might be called from anywhere, including xmon. > > There's also RTAS calls in low-level xics interrupt code, that might get > called during panic/crash. > > I don't expect any of those places to be called with MSR[RI] unset, but > I'm worried that if we're already crashing and for some reason MSR[RI] > is unset, then that BUG_ON will just make things worse. > > eg. imagine taking a BUG_ON() for every character we try to print as > part of an oops. > > Admittedly CONFIG_UDBG_RTAS_CONSOLE is old and probably not used much > anymore, but I'm still a bit paranoid :) I think you're right to be paranoid :) This part of code can be really sensitive. I boot a kernel built with CONFIG_UDBG_RTAS_CONSOLE, xmon is working fine, but I cannot pretend this is covering all the RTAS call cases. My hope with BUG_ON() is to raise the issue, as soon as possible, so it can be addressed during the test phase.