On Fri, 29 Nov 2019, Vasily Gorbik wrote: > On Wed, Nov 06, 2019 at 10:56:01AM +0100, Miroslav Benes wrote: > > The livepatch consistency model requires reliable stack tracing > > architecture support in order to work properly. In order to achieve > > this, two main issues have to be solved. First, reliable and consistent > > call chain backtracing has to be ensured. Second, the unwinder needs to > > be able to detect stack corruptions and return errors. > > I tried to address that in a patch series I pushed here: > https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/log/?h=livepatch > > It partially includes your changes from this patch which have been split > in 2 patches: > s390/unwind: add stack pointer alignment sanity checks > s390/livepatch: Implement reliable stack tracing for the consistency model > > The primary goal was to make our backchain unwinder reliable itself. And > suitable for livepatching as is (we extra checks at the caller, see > below). Besides unwinder changes few things have been improved to avoid > special handling during unwinding. > - all tasks now have pt_regs at the bottom of task stack. > - final backchain always contains 0, no further references to no_dat stack. > - successful unwinding means reaching pt_regs at the bottom of task stack, > and unwinder guarantees that unwind_error() reflects that. > - final pt_regs at the bottom of task stack is never included in unwinding > results. It never was for user tasks. And kernel tasks cannot return to > that state anyway (and in some cases pt_regs are empty). > - unwinder starts unwinding from a reliable state (not "sp" passed as > an argument). Great. Thanks for doing that. It all looks good to me and the outcome is definitely better. I obviously still had some misconceptions about the code and it is much clearer now. > There is also s390 specific unwinder testing module. > > > Idle tasks are a bit special. Their final back chains point to no_dat > > stacks. See for reference CALL_ON_STACK() in smp_start_secondary() > > callback used in __cpu_up(). The unwinding is stopped there and it is > > not considered to be a stack corruption. > I changed that with commit: > s390: avoid misusing CALL_ON_STACK for task stack setup > Now idle tasks are not special, final back chain contains 0. > > > --- > > arch/s390/Kconfig | 1 + > > arch/s390/kernel/dumpstack.c | 11 +++++++ > > arch/s390/kernel/stacktrace.c | 46 ++++++++++++++++++++++++++ > > arch/s390/kernel/unwind_bc.c | 61 +++++++++++++++++++++++++++-------- > > 4 files changed, 106 insertions(+), 13 deletions(-) > > > > --- a/arch/s390/kernel/dumpstack.c > > +++ b/arch/s390/kernel/dumpstack.c > > @@ -94,12 +94,23 @@ int get_stack_info(unsigned long sp, struct task_struct *task, > > if (!sp) > > goto unknown; > > > > + /* Sanity check: ABI requires SP to be aligned 8 bytes. */ > > + if (sp & 0x7) > > + goto unknown; > > + > This has been split in a separate commit: > s390/unwind: add stack pointer alignment sanity checks > > > + /* > > + * The reliable unwinding should not start on nodat_stack, async_stack > > + * or restart_stack. The task is either current or must be inactive. > > + */ > > + if (unwind_reliable) > > + goto unknown; > I moved this check in arch_stack_walk_reliable itself. See below. > > > static bool unwind_use_regs(struct unwind_state *state) > > @@ -85,10 +94,13 @@ static bool unwind_use_frame(struct unwind_state *state, unsigned long sp, > > struct stack_frame *sf; > > unsigned long ip; > > > > - if (unlikely(outside_of_stack(state, sp))) { > > - if (!update_stack_info(state, sp)) > > - goto out_err; > > - } > > + /* > > + * No need to update stack info when unwind_reliable is true. We should > > + * be on a task stack and everything else is an error. > > + */ > > + if (unlikely(outside_of_stack(state, sp)) && > > + (unwind_reliable || !update_stack_info(state, sp))) > > + goto out_err; > I moved this check in arch_stack_walk_reliable itself. See below. > > > + /* Unwind reliable */ > > + if ((unsigned long)regs != info->end - sizeof(struct pt_regs)) > > + goto out_err; > I moved this check in arch_stack_walk_reliable itself. See below. > > > > @@ -136,8 +162,17 @@ bool unwind_next_frame(struct unwind_state *state, bool unwind_reliable) > > sf = (struct stack_frame *) state->sp; > > sp = READ_ONCE_NOCHECK(sf->back_chain); > > > > - /* Non-zero back-chain points to the previous frame */ > > - if (likely(sp)) > > + /* > > + * Non-zero back-chain points to the previous frame > > + * > > + * unwind_reliable case: Idle tasks are special. The final > > + * back-chain points to nodat_stack. See CALL_ON_STACK() in > > + * smp_start_secondary() callback used in __cpu_up(). We just > > + * accept it and look for pt_regs. > > + */ > > + if (likely(sp) && > > + (!unwind_reliable || !(is_idle_task(state->task) && > > + outside_of_stack(state, sp)))) > > return unwind_use_frame(state, sp, unwind_reliable); > This is not needed anymore. > > In the end this all boils down to arch_stack_walk_reliable > implementation. I made the following changes compaired to your version: > --- > - use plain unwind_for_each_frame > - "state.stack_info.type != STACK_TYPE_TASK" guarantees that we are > not leaving task stack. > - "if (state.regs)" guarantees that we have not met an program interrupt > pt_regs (page faults) or preempted. Corresponds to yours: > > + if ((unsigned long)regs != info->end - sizeof(struct pt_regs)) > > + goto out_err; Agreed. All this should be equivalent to the changes I made. > - I don't see a point in storing "kernel_thread_starter", am I missing > something? No, you don't. It was just for the sake of completeness and it is not worth it. [...] > I'd appreciate if you could give those changes a spin. And check if > something is missing or broken. Or share your livepatching testing > procedure. If everything goes well, I might include these changes in > second pull request for this 5.5 merge window. > > I did successfully run ./testing/selftests/livepatch/test-livepatch.sh 'make run_tests' in tools/testing/selftests/livepatch/ would call all the tests in the directory. Especially test-callbacks.sh which stresses the livepatching even more. > https://github.com/lpechacek/qa_test_klp seems outdated. I was able to > fix and run some tests of it but haven't had time to figure out all of > them. Is there a version that would run on top of current Linus tree? Ah, sorry. I should have mentioned that. The code became outdated with recent upstream changes. Libor is working on the updates (CCed). Anyway, I ran it here and it all works fine. Tested-by: Miroslav Benes <mbenes@xxxxxxx> Thanks a lot Miroslav