Hi Torsten,

On 04/01/2019 14:10, Torsten Duwe wrote:
> Use -fpatchable-function-entry (gcc8) to add 2 NOPs at the beginning
> of each function. Replace the first NOP thus generated with a quick LR
> saver (move it to scratch reg x9), so the 2nd replacement insn, the call
> to ftrace, does not clobber the value. Ftrace will then generate the
> standard stack frames.
>
> Note that patchable-function-entry in GCC disables IPA-RA, which means
> ABI register calling conventions are obeyed *and* scratch registers
> such as x9 are available.
>
> Introduce and handle an ftrace_regs_trampoline for module PLTs, right
> after ftrace_trampoline, and double the size of this special section.
>
> Signed-off-by: Torsten Duwe <duwe@xxxxxxx>
>

I wanted to test this patch (and try to benchmark having the
"mov x9, x30" always present in function prelude vs having two nops),
but I cannot get this patch to apply (despite having a version including
both commits below).

Could you provide a git branch from which I could try to rebase the
patch? (Or a new version of the series)

> ---
>
> This patch applies on 4.20 with the additional changes
> bdb85cd1d20669dfae813555dddb745ad09323ba
> (arm64/module: switch to ADRP/ADD sequences for PLT entries)
> and
> 7dc48bf96aa0fc8aa5b38cc3e5c36ac03171e680
> (arm64: ftrace: always pass instrumented pc in x0)
> along with their respective series, or alternatively on Linus' master,
> which already has these.
>
> changes since v5:
>
> * fix mentioned pc in x0 to hold the start address of the call site,
>   not the return address or the branch address.
>   This resolves the problem found by Amit.
>
> ---
>  arch/arm64/Kconfig                    |    2
>  arch/arm64/Makefile                   |    4 +
>  arch/arm64/include/asm/assembler.h    |    1
>  arch/arm64/include/asm/ftrace.h       |   13 +++
>  arch/arm64/include/asm/module.h       |    3
>  arch/arm64/kernel/Makefile            |    6 -
>  arch/arm64/kernel/entry-ftrace.S      |  131 ++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/ftrace.c            |  125 ++++++++++++++++++++++++--------
>  arch/arm64/kernel/module-plts.c       |    3
>  arch/arm64/kernel/module.c            |    2
>  drivers/firmware/efi/libstub/Makefile |    3
>  include/asm-generic/vmlinux.lds.h     |    1
>  include/linux/compiler_types.h        |    4 +
>  13 files changed, 262 insertions(+), 36 deletions(-)

[...]

> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S

[...]

> @@ -122,6 +124,7 @@ skip_ftrace_call: // }
>  ENDPROC(_mcount)
>
>  #else /* CONFIG_DYNAMIC_FTRACE */
> +#ifndef CONFIG_DYNAMIC_FTRACE_WITH_REGS
>  /*
>   * _mcount() is used to build the kernel with -pg option, but all the branch
>   * instructions to _mcount() are replaced to NOP initially at kernel start up,
> @@ -159,6 +162,124 @@ GLOBAL(ftrace_graph_call) // ftrace_gra
>
>  	mcount_exit
>  ENDPROC(ftrace_caller)
> +#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
> +
> +/*
> + * Since no -pg or similar compiler flag is used, there should really be
> + * no reference to _mcount; so do not define one. Only some value for
> + * MCOUNT_ADDR is needed for comparison. Let it point here to have some
> + * sort of magic value that can be recognised when debugging.
> + */
> +GLOBAL(_mcount)
> +	ret	/* make it differ from regs caller */

There's something I can't figure out. Since there are no callers to
_mcount, how does the ftrace core build up its record of patchable
functions?

I don't fully understand the core ftrace code, but I've got the
impression that without this record of struct dyn_ftrace, ftrace cannot
patch in calls to tracers in the future.

Am I missing something?

Thanks,

-- 
Julien Thierry