Nicolai Stange <nstange@xxxxxxx> writes:
> Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:
>
>> Joe Lawrence <joe.lawrence@xxxxxxxxxx> writes:
>>> From: Nicolai Stange <nstange@xxxxxxx>
>>>
>>> The ppc64 specific implementation of the reliable stacktracer,
>>> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
>>> trace" whenever it finds an exception frame on the stack. Stack frames
>>> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
>>> as written by exception prologues, is found at a particular location.
>>>
>>> However, as observed by Joe Lawrence, it is possible in practice that
>>> non-exception stack frames can alias with prior exception frames and thus,
>>> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
>>> the stack. It in turn falsely reports an unreliable stacktrace and blocks
>>> any live patching transition to finish. Said condition lasts until the
>>> stack frame is overwritten/initialized by function call or other means.
>>>
>>> In principle, we could mitigate this by making the exception frame
>>> classification condition in save_stack_trace_tsk_reliable() stronger:
>>> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
>>> account that for all exceptions executing on the kernel stack
>>> - their stack frames's backlink pointers always match what is saved
>>>   in their pt_regs instance's ->gpr[1] slot and that
>>> - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>>>   uncommonly large for non-exception frames.
>>>
>>> However, while these are currently true, relying on them would make the
>>> reliable stacktrace implementation more sensitive towards future changes in
>>> the exception entry code. Note that false negatives, i.e. not detecting
>>> exception frames, would silently break the live patching consistency model.
>>>
>>> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
>>> rely on STACK_FRAME_REGS_MARKER as well.
>>>
>>> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
>>> for those exceptions running on the "normal" kernel stack and returning
>>> to kernelspace: because the topmost frame is ignored by the reliable stack
>>> tracer anyway, returns to userspace don't need to take care of clearing
>>> the marker.
>>>
>>> Furthermore, as I don't have the ability to test this on Book 3E or
>>> 32 bits, limit the change to Book 3S and 64 bits.
>>>
>>> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
>>> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
>>> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
>>> PPC_BOOK3S_64, there's no functional change here.
>>
>> That has nothing to do with the fix and should really be in a separate
>> patch.
>>
>> I can split it when applying.
>
> If you don't mind, that would be nice! Or simply drop that
> chunk... Otherwise, let me know if I shall send a split v2 for this
> patch [1/4] only.

No worries, I split it out:

commit a50d3250d7ae34c561177a1f9cfb79816fcbcff1
Author:     Nicolai Stange <nstange@xxxxxxx>
AuthorDate: Thu Jan 31 16:41:50 2019 +1100
Commit:     Michael Ellerman <mpe@xxxxxxxxxxxxxx>
CommitDate: Thu Jan 31 16:43:29 2019 +1100

    powerpc/64s: Make reliable stacktrace dependency clearer

    Make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
    PPC_BOOK3S_64 for documentation purposes. Before this patch, it
    depended on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN
    implies PPC_BOOK3S_64, there's no functional change here.

    Signed-off-by: Nicolai Stange <nstange@xxxxxxx>
    Signed-off-by: Joe Lawrence <joe.lawrence@xxxxxxxxxx>
    [mpe: Split out of larger patch]
    Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2890d36eb531..73bf87b1d274 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -220,7 +220,7 @@ config PPC
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE if SMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
-	select HAVE_RELIABLE_STACKTRACE if PPC64 && CPU_LITTLE_ENDIAN
+	select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_IRQ_TIME_ACCOUNTING

cheers
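
For readers following the thread, the stronger exception-frame classification that the quoted commit message considers (and decides against) can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel's code: struct frame_model, marker_only_check(), looks_like_exception_frame() and the HYP_STACK_INT_FRAME_SIZE placeholder value are all hypothetical names introduced here, and the real on-stack layout is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASCII "regshere": the marker magic written by exception prologues. */
#define STACK_FRAME_REGS_MARKER UINT64_C(0x7265677368657265)

/* Placeholder only; the real STACK_INT_FRAME_SIZE depends on kernel headers. */
#define HYP_STACK_INT_FRAME_SIZE UINT64_C(512)

/* Hypothetical, simplified view of the words the checks would look at. */
struct frame_model {
	uint64_t backlink;    /* saved stack pointer of the previous frame */
	uint64_t marker_slot; /* location where the marker magic would sit */
	uint64_t saved_gpr1;  /* pt_regs ->gpr[1] slot stored in the frame */
};

/* Marker-only test: a stale marker is enough to make it fire. */
static bool marker_only_check(const struct frame_model *f)
{
	assert(f != NULL);
	return f->marker_slot == STACK_FRAME_REGS_MARKER;
}

/* Stronger test: marker, plus the two extra properties from the message. */
static bool looks_like_exception_frame(uint64_t sp, const struct frame_model *f)
{
	if (!marker_only_check(f))
		return false;
	/* Backlink must match what is saved in the ->gpr[1] slot. */
	if (f->backlink != f->saved_gpr1)
		return false;
	/* Frame size must equal the (placeholder) interrupt frame size. */
	return f->backlink - sp == HYP_STACK_INT_FRAME_SIZE;
}
```

In this model, a small ordinary frame that merely aliases a stale marker passes marker_only_check() but fails looks_like_exception_frame(). As the commit message notes, relying on the extra properties would tie the stacktracer to details of the exception entry code, which is why the patch clears the marker on exception exit instead.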