On Wed, Apr 03, 2019 at 03:48:43AM +0100, Mark Rutland wrote: > Hi Torsten, > > Sorry for the long delay prior to this reply. I was hoping you would come up with some code to speed things up :( (For the record, v8 was the latest I sent, but nothing in the locations mentioned here has changed) > On Fri, Jan 18, 2019 at 05:39:08PM +0100, Torsten Duwe wrote: > > --- a/arch/arm64/include/asm/ftrace.h > > +++ b/arch/arm64/include/asm/ftrace.h > > @@ -14,9 +14,24 @@ > > #include <asm/insn.h> > > > > #define HAVE_FUNCTION_GRAPH_FP_TEST > > -#define MCOUNT_ADDR ((unsigned long)_mcount) > > #define MCOUNT_INSN_SIZE AARCH64_INSN_SIZE > > > > +/* > > + * DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning > > + * of each function, with the second NOP actually calling ftrace. In contrary > > + * to a classic _mcount call, the call instruction to be modified is thus > > + * the second one, and not the only one. > > + */ > > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS > > +#define ARCH_SUPPORTS_FTRACE_OPS 1 > > +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE > > +/* All we need is some magic value. Simply use "_mCount:" */ > > +#define MCOUNT_ADDR (0x5f6d436f756e743a) > > I'm really not keen on having a magic constant, and I'd really like to > see MCOUNT_ADDR disappear entirely when the compiler doesn't generate an > mcount call. I'm concerned that this is confusing and fragile. Confusing: agreed. Fragile? don't think so. > I think that it would be better to make the core ftrace API agnostic of > mcount, and to teach it that there's a one-time initialization step for > callsites (which is not necessarily patching a call with a NOP). > > We currently have: > > ftrace_make_nop(mod, rec, addr) > ftrace_make_call(rec, addr) > ftrace_modify_call(rec, old_addr, new_addr) > > ... whereas we could have: > > ftrace_call_init(mod, rec) > ftrace_call_enable(rec, addr) > ftrace_call_disable(rec, addr) > ftrace_call_modify(rec, old_addr, new_addr) > > ... so we wouldn't need to special-case anything for initialization (and > don't need MCOUNT_ADDR at all), and it would be clearer as to what was > happening at each stage. I'm fully on your side here, BUT... This is the dynamic ftrace with regs implementation for aarch64 with the _current_ dynamic ftrace API. Changing the latter is IMO a different issue. This implementation has been tested, up to the point of some preliminary live patching. I propose to commit this and only then change the API for aarch64 and s390 simultaneously, avoiding breakage on ppc64le and x86, of course. (others to be investigated) > > + > > + /* The program counter just after the ftrace call site */ > > + str lr, [sp, #S_PC] > > For consistency with the existing ftrace assembly, please use 'x30' > rather than 'lr'. You could have raised that concern already along with the "fp" issue for v6. I don't have a strong preference here; personally I find fp and lr more readable, but with x29 and x30 one knows where they go. This is only just a waste of time. > > - module_disable_ro(mod); > > - *mod->arch.ftrace_trampoline = trampoline; > > - module_enable_ro(mod, true); > > + /* Check against our well-known list of ftrace entry points */ > > + if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) { > > These checks exist within install_ftrace_trampoline(), so we don't need > to duplicate them here. True. Code evolution at work. Any other opinions on the dynamic ftrace API change, anyone? Torsten