On Wed, Jun 03, 2020 at 10:06:07PM +0800, Wangshaobo (bobo) wrote: > Today I test your fix, but arch_stack_walk_reliable() still return failed > sometimes, I > > found one of three scenarios mentioned failed: > > > 1. user task (just fork) but not been scheduled > > test FAILED > > it is because unwind_next_frame() get the first frame, this time > state->signal is false, and then return > > failed in the same place for ret_from_fork has not executed at all. Oops - I meant to do it in __unwind_start (as you discovered). > 2. user task (just fork) start excuting ret_from_fork() till schedule_tail > but not UNWIND_HINT_REGS > > test condition :loop fork() in current system > > result : SUCCESS, > > it looks like this modification works for my perspective : > > - /* Success path for non-user tasks, i.e. kthreads and idle tasks */ > - if (!(task->flags & (PF_KTHREAD | PF_IDLE))) > - return -EINVAL; > but is this possible to miss one invalid judgement condition ? (1) I'm not sure I understand your question, but I think this change shouldn't break anything else because the ORC unwinder is careful to always set an error if it doesn't reach the end of the stack, especially with my recent ORC fixes which went into 5.7. > 3. call_usermodehelper_exec_async > > test condition :loop call call_usermodehelper() in a module selfmade. > > result : SUCCESS, > > it looks state->signal==true works when unwind_next_frame() gets the end > ret_from_fork() frame, > > but i don't know how does it work, i am confused by this sentences, how > does the comment means sibling calls and > > calls to noreturn functions? (2) > > /* > * Find the orc_entry associated with the text address. > * > * Decrement call return addresses by one so they work for sibling > * calls and calls to noreturn functions. > */ > orc = orc_find(state->signal ? state->ip : state->ip - 1); > if (!orc) { To be honest, I don't remember what I meant by sibling calls. They don't even leave anything on the stack. For noreturns, the code might be laid out like this: func1: ... call noreturn_foo func2: func2 is immediately after the call to noreturn_foo. So the return address on the stack will actually be 'func2'. We want to retrieve the ORC data for the call instruction (inside func1), instead of the instruction at the beginning of func2. I should probably update that comment. -- Josh