Michael Ellerman's on March 3, 2020 9:43 pm: > Nicholas Piggin <npiggin@xxxxxxxxx> writes: >> Use ARCH_HAS_ADDRESS_LOOKUP to look up the opal symbol table. This >> allows crashes and xmon debugging to print firmware symbols. >> >> Oops: System Reset, sig: 6 [#1] >> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV >> Modules linked in: >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc2-dirty #903 >> NIP: 0000000030020434 LR: 000000003000378c CTR: 0000000030020414 >> REGS: c0000000fffc3d70 TRAP: 0100 Not tainted (5.6.0-rc2-dirty) >> MSR: 9000000002101002 <SF,HV,VEC,ME,RI> CR: 28022284 XER: 20040000 >> CFAR: 0000000030003788 IRQMASK: 3 >> GPR00: 000000003000378c 0000000031c13c90 0000000030136200 c0000000012cfa10 >> GPR04: c0000000012cfa10 0000000000000010 0000000000000000 0000000031c10060 >> GPR08: c0000000012cfaaf 0000000030003640 0000000000000000 0000000000000001 >> GPR12: 00000000300e0000 c000000001490000 0000000000000000 c00000000139c588 >> GPR16: 0000000031c10000 c00000000125a900 0000000000000000 c0000000012076a8 >> GPR20: c0000000012a3950 0000000000000001 0000000031c10060 c0000000012cfaaf >> GPR24: 0000000000000019 0000000030003640 0000000000000000 0000000000000000 >> GPR28: 0000000000000010 c0000000012cfa10 0000000000000000 0000000000000000 >> NIP [0000000030020434] .dummy_console_write_buffer_space+0x20/0x64 [OPAL] >> LR [000000003000378c] opal_entry+0x14c/0x17c [OPAL] >> >> This won't unwind the firmware stack (or its Linux caller) properly if >> firmware and kernel endians don't match, but that problem could be solved >> in powerpc's unwinder. > > How well does this work if we're tracing opal calls at the time we oops :) > > Though it looks like that's already fishy because we don't do anything > to disable tracing of opal_console_write(). Yeah we don't do perfectly well in this case still. OPAL itself has locks in its console paths and some issues with stack reentrancy. We should do a bit better with cutting out more junk including tracing from crash paths, so this doesn't fundamentally make things harder. > I guess I'm a bit wary of adding numerous further opal calls in the oops > path, I'm sure the opal symbol lookup code is bug free, but still. There's a few, console write, event poll, reboot, and NMI IPI AFAIK, so we have to make the opal call path itself robust (it's getting there). > Could we instead suck in the opal symbols early on, and search them in > Linux? I suspect you've thought of that and rejected it, but it would be > good to document why. We could, I was thinking we might want OPAL to do something special with them like add module annotations [OPAL] vs [HOMER] or whatever, relocate itself after boot if we randomize where it's loaded etc. but perhaps none of those things really prevent the symbols being discovered at boot time. I don't know, it was easier? :) >> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h >> index c1f25a760eb1..c3a2a797177a 100644 >> --- a/arch/powerpc/include/asm/opal-api.h >> +++ b/arch/powerpc/include/asm/opal-api.h >> @@ -214,7 +214,11 @@ >> #define OPAL_SECVAR_GET 176 >> #define OPAL_SECVAR_GET_NEXT 177 >> #define OPAL_SECVAR_ENQUEUE_UPDATE 178 >> -#define OPAL_LAST 178 >> +#define OPAL_PHB_SET_OPTION 179 >> +#define OPAL_PHB_GET_OPTION 180 > > Only pull in the calls you need for this patch. Ah okay I didn't realise that was the policy, makes sense. Thanks, Nick