Hi, On Thu, 2010-03-11 at 10:49 -0800, David Daney wrote: [...] > > +/* > > + * If the Instruction Pointer is in module space (0xc0000000), return ture; > > s/ture/true/ > yeah. > > + * otherwise, it is in kernel space (0x80000000), return false. > > + */ > > +#define in_module(ip) (unlikely((ip)& 0x40000000)) > > + > > This isn't universally true, but it does hold for most configurations I > think. Although I'm not sure who is the exception, we always need an universal solution, what about this: Compare module with kernel: module: <saving registers> lui v1, hi16_mcount <--- ip addiu v1, v1, lo16_mcount move at, ra jalr v1 nop kernel: <saving registers> move at, ra jal _mcount <--- ip The above _ip_ is the address have been recorded into the __mcount_loc section of the kernel by scripts/recordmcount.pl, as we can see, for kernel, the *(ip - 4) is "move at, ra": 03e0082d, a certain instruction, but for module, there is no possibility(?) of existing a "move at, ra" at *(ip -4) but a register saving operation("s {d,w} rs, offset(sp)", prefixed by 0xffb0 for 64bit and 0xafb0 for 32bit. ), and reversly, for kernel, there is no such instruction there. And consider the new option -mmcount-ra-address of gcc, some more instructions will be inserted between "move at, ra" and the calling site to mcount, so, *(ip-4) will not always be "move at, ra", then we need to check if there is a "s {d,w} rs, offset(sp)" there, if yes, it is in module, otherwise, it should be in kernel. #define S_RS_SP 0xafb00000 /* s{d,w} rs, offset(sp) */ static inline int in_module(ip) { insn = *(ip - 4); /* need to use safe_load_code instead, what about big endian? */ return ((insn & S_RS_SP) == S_RS_SP) } > > [...] > > > + /* > > + * We have compiled modules with -mlong-calls, but compiled kernel > > + * without it, therefore, need to cope with them respectively. > > + * > > + * For module: > > + * > > + * lui v1, hi16_mcount --> b 1f > > + * addiu v1, v1, lo16_mcount > > + * move at, ra > > + * jalr v1 > > + * nop > > + * 1f: (ip + 16) > > > Have you thought about just overwriting the jalr here instead of > branching around it? In any event, I don't think you can count on a > fixed size code sequence for calling _mcount. We are passing the > address of the save location of RA to _mcount too. The size of the code > will depend on the size of the functions stack frame *and* weather or > not it is a leaf function. Although in the kernel we are unlikely to > see functions with large stack frames. So even with "b 1f", we need to use the right offset, the original version for module with -mmcount-ra-address should have bugs here for the offset should be 16 + 8 or 4 (two instructions for leaf function, one instruction for non-leaf function). but for we only recorded the position of "lui v1, hi16_mcount" in the __mcount_loc section, so we need to search the position of the real calling site of mcount(jalr v1), this will goes to what you have suggested below. > > > > + * For kernel: > > + * > > + * move at, ra > > + * jal _mcount --> nop > > + * > > + */ > > + new = in_module(ip) ? INSN_B_1F : INSN_NOP; > > > What would happen if you read the code to find the first JAL or JALR, > and then overwrote it with a NOP instead of relying on the function > address to figure out which type of prolog it has? > > The reason I suggest this is that sometimes we place the entire kernel > in CSSEG. When this is done, everything has the same (short) _mcount > calling sequence. Right, then, we can search the JAL or JALR, for kernel, will get it immediatly, for module, will only several instructions, we can do this searching in ftrace_make_nop and ftrace_make_call at run-time, but just found we can use the following function to do it in ftrace_init(), looks good. static inline int is_call_mcount(unsigned int insn) { return ((insn & JAL) == JAL) || (insn == JALR_V1); } static inline unsinged long mcount_callsite(unsigned long addr) { unsigned int insn; insn = *(unsigned int *)addr; /*need safe_load_code*/ if (is_call_mcount(insn)) return addr; do { addr += 4; /* what about big endian? */ insn = *(unsigned int *)addr; /*need safe_load_code*/ } while (!is_call_mcount(insn)); return addr; } static inline unsigned long ftrace_call_adjust(unsigned long addr) { return mcount_callsite(addr); } With the above support, we only need this new ftrace_make_nop: *(unsigned int *)ip = INSN_NOP; (But for module, this may need more overhead than "b 1f". ) and ftrace_make_call: *(unsigned int *)ip = in_module(ip) : INSN_JALR_V1 : insn_jal_mcount; (And here, for module, we need more time to determine which space we are.) Any more suggestion? Thanks & Regards, Wu Zhangjin