On Tue, 2019-01-22 at 15:57:21 UTC, Joe Lawrence wrote:
> From: Nicolai Stange <nstange@xxxxxxx>
>
> The ppc64 specific implementation of the reliable stacktracer,
> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
> trace" whenever it finds an exception frame on the stack. Stack frames
> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
> as written by exception prologues, is found at a particular location.
>
> However, as observed by Joe Lawrence, it is possible in practice that
> non-exception stack frames can alias with prior exception frames and thus,
> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
> the stack. It in turn falsely reports an unreliable stacktrace and blocks
> any live patching transition to finish. Said condition lasts until the
> stack frame is overwritten/initialized by function call or other means.
>
> In principle, we could mitigate this by making the exception frame
> classification condition in save_stack_trace_tsk_reliable() stronger:
> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
> account that for all exceptions executing on the kernel stack
> - their stack frames's backlink pointers always match what is saved
>   in their pt_regs instance's ->gpr[1] slot and that
> - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>   uncommonly large for non-exception frames.
>
> However, while these are currently true, relying on them would make the
> reliable stacktrace implementation more sensitive towards future changes in
> the exception entry code. Note that false negatives, i.e. not detecting
> exception frames, would silently break the live patching consistency model.
>
> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
> rely on STACK_FRAME_REGS_MARKER as well.
>
> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
> for those exceptions running on the "normal" kernel stack and returning
> to kernelspace: because the topmost frame is ignored by the reliable stack
> tracer anyway, returns to userspace don't need to take care of clearing
> the marker.
>
> Furthermore, as I don't have the ability to test this on Book 3E or
> 32 bits, limit the change to Book 3S and 64 bits.
>
> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
> PPC_BOOK3S_64, there's no functional change here.
>
> Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for the consistency model")
> Reported-by: Joe Lawrence <joe.lawrence@xxxxxxxxxx>
> Signed-off-by: Nicolai Stange <nstange@xxxxxxx>
> Signed-off-by: Joe Lawrence <joe.lawrence@xxxxxxxxxx>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/eddd0b332304d554ad6243942f87c2fc

cheers