On Mon, Jan 14, 2013 at 5:12 PM, David Daney <ddaney.cavm@xxxxxxxxx> wrote: > On 01/14/2013 01:10 PM, Alan Cooper wrote: >> >> I already tried using "adddiu sp, sp, 8" and it caused the kernel to >> randomly crash. After many hours of debugging the reason occurred to >> me while in bed in the middle of the night. The problem is that if we >> get an interrupt between the add 8 and the add -8 instructions, we >> trash the existing stack. >> >> The problem with the 2 nop approach is that there are a series of >> subroutines used to write each nop and these nested subroutines are >> traceable. > > > This seems like a bug. The low-level code used to do code patching probably > should be CFLAGS_REMOVE_file.o = -pg While tracing mcount cannot be done because it's recursive, allowing tracing of the code to enable/disable the call to mcount can be done and seems useful. Also, fixing the 2 nop solution this way will still not allow us to stop using stop_machine() which is hugely disruptive to a running system. Remember that when tracing is enabled and disabled we end up modifying 20 to 30 thousand functions. Moving this functionality out of stop_machine() seems like a big benefit. > > > >> This means on the second call to these subroutines they >> execute with only one nop and crash. I could write some new code >> that wrote the 2 nops at once, but (now that I understand >> "stop_machine") with the branch likely solution we should be able to >> stop using "stop_machine" when we write nops to the 20-30 thousand >> Linux functions. It looks like other platforms have stopped using >> stop_machine. > > > I don't particularly object to the 'branch likely solution', but I think the > failures of the other approaches indicates underlying bugs in the tracing > code. Those bugs should probably be fixed. If a solution can be found that modifies a single 32bit instruction to enable/disable tracing, I don't see any bugs in the underlying code. Plus we can avoid using stop_machine(). > > David Daney > > > >> >> Al >> >> On Fri, Jan 11, 2013 at 12:01 PM, David Daney <ddaney.cavm@xxxxxxxxx> >> wrote: >>> >>> On 01/11/2013 06:33 AM, Al Cooper wrote: >>>> >>>> >>>> Function tracing is currently broken for all 32 bit MIPS platforms. >>>> When tracing is enabled, the kernel immediately hangs on boot. >>>> This is a result of commit b732d439cb43336cd6d7e804ecb2c81193ef63b0 >>>> that changes the kernel/trace/Kconfig file so that is no longer >>>> forces FRAME_POINTER when FUNCTION_TRACING is enabled. >>>> >>>> MIPS frame pointers are generally considered to be useless because >>>> they cannot be used to unwind the stack. Unfortunately the MIPS >>>> function tracing code has bugs that are masked by the use of frame >>>> pointers. This commit fixes the bugs so that MIPS frame pointers do >>>> not need to be enabled. >>>> >>>> The bugs are a result of the odd calling sequence used to call the trace >>>> routine. This calling sequence is inserted into every tracable function >>>> when the tracing CONFIG option is enabled. This sequence is generated >>>> for 32bit MIPS platforms by the compiler via the "-pg" flag. >>>> Part of the sequence is "addiu sp,sp,-8" in the delay slot after every >>>> call to the trace routine "_mcount" (some legacy thing where 2 arguments >>>> used to be pushed on the stack). The _mcount routine is expected to >>>> adjust the sp by +8 before returning. >>>> >>>> One of the bugs is that when tracing is disabled for a function, the >>>> "jalr _mcount" instruction is replaced with a nop, but the >>>> "addiu sp,sp,-8" is still executed and the stack pointer is left >>>> trashed. When frame pointers are enabled the problem is masked >>>> because any access to the stack is done through the frame >>>> pointer and the stack pointer is restored from the frame pointer when >>>> the function returns. This patch uses a branch likely instruction >>>> "bltzl zero, f1" instead of "nop" to disable the call because this >>>> instruction will not execute the "addiu sp,sp,-8" instruction in >>>> the delay slot. The other possible solution would be to nop out both >>>> the jalr and the "addiu sp,sp,-8", but this would need to be interrupt >>>> and SMP safe and would be much more intrusive. >>> >>> >>> >>> I thought all CPUs were in stop_machine() when the modifications were >>> done, >>> so that there is no issue with multi-word instruction patching. >>> >>> Am I wrong about this? >>> >>> So really I think you can do two NOP just as easily. >>> >>> The only reason I bring this up is that I am not sure all 32-bit CPUs >>> implement the 'Likely' branch variants. Also there may be an affect on >>> the >>> branch predictor. >>> >>> A third possibility would be to replace the JALR with 'ADDIU SP,SP,8' >>> That >>> way the following adjustment would be canceled out. This would work well >>> on >>> single-issue CPUs, but the instructions may not be able to dual-issue on >>> a >>> multi issue machine due to data dependencies. >>> >>> David Daney >>> >>> >>>> >>>> A few other bugs were fixed where the _mcount routine itself did not >>>> always fix the sp on return. >>>> >>> >> >> >