On Fri Jun 21, 2024 at 4:54 AM AEST, Naveen N Rao wrote:
> On 32-bit powerpc, gcc generates a three instruction sequence for
> function profiling:
> 	mflr	r0
> 	stw	r0, 4(r1)
> 	bl	_mcount
>
> On kernel boot, the call to _mcount() is nop-ed out, to be patched back
> in when ftrace is actually enabled. The 'stw' instruction therefore is
> not necessary unless ftrace is enabled. Nop it out during ftrace init.
>
> When ftrace is enabled, we want the 'stw' so that stack unwinding works
> properly. Perform the same within the ftrace handler, similar to 64-bit
> powerpc.
>
> For 64-bit powerpc, early versions of gcc used to emit a three
> instruction sequence for function profiling (with -mprofile-kernel) with
> a 'std' instruction to mimic the 'stw' above. Address that scenario also
> by nop-ing out the 'std' instruction during ftrace init.

Cool! Could 32-bit use the 2-insn sequence as well if it had
-mprofile-kernel, out of curiosity?

>
> Signed-off-by: Naveen N Rao <naveen@xxxxxxxxxx>
> ---
>  arch/powerpc/kernel/trace/ftrace.c       | 6 ++++--
>  arch/powerpc/kernel/trace/ftrace_entry.S | 4 ++--
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
> index d8d6b4fd9a14..463bd7531dc8 100644
> --- a/arch/powerpc/kernel/trace/ftrace.c
> +++ b/arch/powerpc/kernel/trace/ftrace.c
> @@ -241,13 +241,15 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
>  		/* Expected sequence: 'mflr r0', 'stw r0,4(r1)', 'bl _mcount' */
>  		ret = ftrace_validate_inst(ip - 8, ppc_inst(PPC_RAW_MFLR(_R0)));
>  		if (!ret)
> -			ret = ftrace_validate_inst(ip - 4, ppc_inst(PPC_RAW_STW(_R0, _R1, 4)));
> +			ret = ftrace_modify_code(ip - 4, ppc_inst(PPC_RAW_STW(_R0, _R1, 4)),
> +						 ppc_inst(PPC_RAW_NOP()));
>  	} else if (IS_ENABLED(CONFIG_MPROFILE_KERNEL)) {
>  		/* Expected sequence: 'mflr r0', ['std r0,16(r1)'], 'bl _mcount' */
>  		ret = ftrace_read_inst(ip - 4, &old);
>  		if (!ret && !ppc_inst_equal(old, ppc_inst(PPC_RAW_MFLR(_R0)))) {
>  			ret = ftrace_validate_inst(ip - 8, ppc_inst(PPC_RAW_MFLR(_R0)));
> -			ret |= ftrace_validate_inst(ip - 4, ppc_inst(PPC_RAW_STD(_R0, _R1, 16)));
> +			ret |= ftrace_modify_code(ip - 4, ppc_inst(PPC_RAW_STD(_R0, _R1, 16)),
> +						  ppc_inst(PPC_RAW_NOP()));

So this is the old style path... Should you check the mflr validate
result first?

Also do you know what GCC version, roughly? Maybe we could have a comment
here and eventually deprecate it.

You could split this change into its own patch.

>  		}
>  	} else {
>  		return -EINVAL;
> diff --git a/arch/powerpc/kernel/trace/ftrace_entry.S b/arch/powerpc/kernel/trace/ftrace_entry.S
> index 76dbe9fd2c0f..244a1c7bb1e8 100644
> --- a/arch/powerpc/kernel/trace/ftrace_entry.S
> +++ b/arch/powerpc/kernel/trace/ftrace_entry.S
> @@ -33,6 +33,8 @@
>   * and then arrange for the ftrace function to be called.
>   */
>  .macro	ftrace_regs_entry allregs
> +	/* Save the original return address in A's stack frame */
> +	PPC_STL		r0, LRSAVE(r1)
>  	/* Create a minimal stack frame for representing B */
>  	PPC_STLU	r1, -STACK_FRAME_MIN_SIZE(r1)
>
> @@ -44,8 +46,6 @@
>  	SAVE_GPRS(3, 10, r1)
>
>  #ifdef CONFIG_PPC64
> -	/* Save the original return address in A's stack frame */
> -	std	r0, LRSAVE+SWITCH_FRAME_SIZE+STACK_FRAME_MIN_SIZE(r1)
>  	/* Ok to continue? */
>  	lbz	r3, PACA_FTRACE_ENABLED(r13)
>  	cmpdi	r3, 0

That seems right to me.

Reviewed-by: Nicholas Piggin <npiggin@xxxxxxxxx>
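
To make the "check the mflr validate result first" question concrete,
here is a rough userspace sketch. It is not kernel code and not a
proposed replacement. The model_* helpers are hypothetical stand-ins for
ftrace_validate_inst() and ftrace_modify_code(), and the "text" is just
an array of instruction words. The point it tries to show: with
'ret |= ftrace_modify_code(...)', the 'std' would still be nop-ed out
when the mflr check at ip - 8 fails, whereas gating on the first result
leaves the text untouched on error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Raw PowerPC instruction encodings used by the model. */
#define INSN_MFLR_R0       0x7c0802a6u  /* mflr r0 */
#define INSN_STD_R0_16_R1  0xf8010010u  /* std r0,16(r1) */
#define INSN_NOP           0x60000000u  /* ori r0,r0,0 */

/* Hypothetical stand-in for ftrace_validate_inst(): compare only. */
static int model_validate_inst(const uint32_t *text, size_t idx,
			       uint32_t expected)
{
	return text[idx] == expected ? 0 : -EINVAL;
}

/*
 * Hypothetical stand-in for ftrace_modify_code(): write the new
 * instruction only if the old one is what we expect.
 */
static int model_modify_code(uint32_t *text, size_t idx,
			     uint32_t old_insn, uint32_t new_insn)
{
	int ret = model_validate_inst(text, idx, old_insn);

	if (!ret)
		text[idx] = new_insn;
	return ret;
}

/*
 * Model of the -mprofile-kernel branch of ftrace_init_nop().
 * call_idx is the index of the 'bl _mcount' word, i.e. 'ip'.
 * The model assumes two instruction slots exist before the call.
 */
int model_init_nop_mprofile(uint32_t *text, size_t call_idx)
{
	int ret = 0;

	assert(call_idx >= 2);

	/* Two-instruction form: 'mflr r0' sits right before the call. */
	if (text[call_idx - 1] == INSN_MFLR_R0)
		return 0;

	/* Old three-instruction form: validate 'mflr r0' first ... */
	ret = model_validate_inst(text, call_idx - 2, INSN_MFLR_R0);
	if (!ret)
		/* ... and only then nop out the 'std'. */
		ret = model_modify_code(text, call_idx - 1,
					INSN_STD_R0_16_R1, INSN_NOP);
	return ret;
}
```

Calling model_init_nop_mprofile() on { mflr, std, bl } turns the middle
word into a nop and returns 0. On a sequence whose first word is not
'mflr r0' it returns an error and the 'std' is left in place.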