Hi Tao, On Thu, Mar 07, 2024 at 08:52:40PM +0800, Tao Liu wrote: > Previously we can only view the stack unwinding for the tasks which are > running on each CPUs. This patch will enable the ability to view > arbitrary tasks stack unwinding. > > After crash get initialized, "info threads" will output like the > following: > > crash> info threads > Id Target Id Frame > 1 CPU 0 native_safe_halt () at arch/x86/include/asm/irqflags.h:54 > ... > * 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640 > ... > 13 CPU 12 <unavailable> in ?? () > 14 CPU 13 native_safe_halt () at arch/x86/include/asm/irqflags.h:54 > ... > > crash> ps > PID PPID CPU TASK ST %MEM VSZ RSS COMM > > 0 0 0 ffffffff819f9480 RU 0.0 0 0 [swapper/0] > > 0 0 1 ffff880169411fa0 RU 0.0 0 0 [swapper/1] > ... > 0 0 23 ffff8801694e0000 RU 0.0 0 0 [swapper/23] > 1 0 13 ffff880169b30000 IN 0.0 193052 4180 systemd > > "info threads" show the tasks which are currently running on each CPU. If we'd > like to view systemd task's stack unwinding, which is inactive status, we > do the following: > > crash> set 1 > or > crash> set ffff880169b30000 > > Then the register cache of systemd will be swapped into CPU 13: > > crash> info threads > Id Target Id Frame > 1 CPU 0 native_safe_halt () at arch/x86/include/asm/irqflags.h:54 > ... > 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640 > ... > 13 CPU 12 <unavailable> in ?? () > * 14 CPU 13 0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527 > ... > > And we can view the stack unwinding of systemd: > > crash> bt > PID: 1 TASK: ffff880169b30000 CPU: 13 COMMAND: "systemd" > #0 [ffff880169b3bd58] __schedule at ffffffff816a8f65 > #1 [ffff880169b3bdc0] schedule at ffffffff816a94e9 > #2 [ffff880169b3bdd0] schedule_hrtimeout_range_clock at ffffffff816a86fd > #3 [ffff880169b3be68] schedule_hrtimeout_range at ffffffff816a8733 > #4 [ffff880169b3be78] ep_poll at ffffffff8124bb7e > #5 [ffff880169b3bf30] sys_epoll_wait at ffffffff8124d00d > #6 [ffff880169b3bf80] system_call_fastpath at ffffffff816b5009 > RIP: 00007f0449407923 RSP: 00007ffc35a3c378 RFLAGS: 00010246 > RAX: 00000000000000e8 RBX: ffffffff816b5009 RCX: 0000000000000071 > RDX: 000000000000001d RSI: 00007ffc35a3d5a0 RDI: 0000000000000004 > RBP: 00007ffc35a3d810 R8: 0000000000000000 R9: 0000000000000000 > R10: 00000000ffffffff R11: 0000000000000293 R12: 0000563ca2ebe980 > R13: 0000000000000003 R14: ffffffffffffffff R15: 0000000000000001 > ORIG_RAX: 00000000000000e8 CS: 0033 SS: 002b > crash> gdb bt > #0 0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527 > #1 __schedule () at kernel/sched/core.c:3540 > #2 0xffffffff816a94e9 in schedule () at kernel/sched/core.c:3577 > #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS, clock=clock@entry=1) at kernel/hrtimer.c:1724 > #4 0xffffffff816a8733 in schedule_hrtimeout_range (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS) at kernel/hrtimer.c:1778 > #5 0xffffffff8124bb7e in ep_poll (ep=0xffff880fd861f8c0, events=events@entry=0x7ffc35a3d5a0, maxevents=maxevents@entry=29, timeout=timeout@entry=-1) at fs/eventpoll.c:1669 > #6 0xffffffff8124d00d in SYSC_epoll_wait (timeout=<optimized out>, maxevents=29, events=<optimized out>, epfd=<optimized out>) at fs/eventpoll.c:2043 > #7 SyS_epoll_wait (epfd=<optimized out>, events=140721208415648, maxevents=29, timeout=4294967295) at fs/eventpoll.c:2008 > #8 <signal handler called> > #9 0x00007f0449407923 in ?? () > > Signed-off-by: Tao Liu <ltao@xxxxxxxxxx> > --- > crash_target.c | 50 ++++++++++++++++------------- > defs.h | 3 +- > gdb-10.2.patch | 85 +------------------------------------------------- > kernel.c | 22 +++++++++++++ > task.c | 5 +-- > 5 files changed, 57 insertions(+), 108 deletions(-) > > diff --git a/crash_target.c b/crash_target.c > index d06383f..8efe2d6 100644 > --- a/crash_target.c > +++ b/crash_target.c > @@ -29,10 +29,10 @@ extern "C" int gdb_readmem_callback(unsigned long, void *, int, int); > extern "C" int crash_get_nr_cpus(void); > extern "C" int crash_get_cpu_reg (int cpu, int regno, const char *regname, > int regsize, void *val); > -extern "C" int gdb_change_cpu_context (unsigned int cpu); > extern "C" void gdb_refresh_regcache(unsigned int cpu); > extern "C" int set_cpu(int cpu, int print_context); > - > +extern "C" int crash_set_thread(ulong); > +extern "C" int gdb_change_thread_context (ulong); > > /* The crash target. */ > > @@ -110,11 +110,13 @@ crash_target::xfer_partial (enum target_object object, const char *annex, > > #define CRASH_INFERIOR_PID 1 > > +crash_target *target = NULL; > + > void > crash_target_init (void) > { > int nr_cpus = crash_get_nr_cpus(); > - crash_target *target = new crash_target (); > + target = new crash_target (); > > /* Own the target until it is successfully pushed. */ > target_ops_up target_holder (target); > @@ -137,29 +139,35 @@ crash_target_init (void) > reinit_frame_cache (); > } > > -/* > - * Change gdb's thread context to the thread on given CPU > - **/ > extern "C" int > -gdb_change_cpu_context(unsigned int cpu) > +gdb_change_thread_context (ulong task) > { > - ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu); > - inferior *inf = current_inferior (); > - thread_info *tp = find_thread_ptid (inf, ptid); > - > - if (tp == nullptr) > + int tried = 0; > + inferior* inf = current_inferior(); > + int cpu = crash_set_thread(task); > + if (cpu < 0) > return FALSE; > > - /* Making sure that crash's context is same */ > - set_cpu(cpu, FALSE); > - > - /* Switch to the thread */ > - switch_to_thread(tp); > - > - /* Fetch/Refresh thread's registers */ > - gdb_refresh_regcache(cpu); > + ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu); > > - return TRUE; > +retry: > + thread_info *tp = find_thread_ptid (inf, ptid); > + if (tp == nullptr && !tried) { > + thread_info *thread = add_thread_silent(target, > + ptid_t(CRASH_INFERIOR_PID, 0, cpu)); A minor comment. Is it possible to do it without needing 'target' as global ? I see there's 'inf->current_top_target()', but it's return type if target_ops, don't know if it can be used. > + tried++; > + if (thread) { > + goto retry; > + } > + } > + > + if (tp == nullptr && tried) > + return FALSE; > + > + target_fetch_registers(get_thread_regcache(tp), -1); > + switch_to_thread(tp); > + reinit_frame_cache (); > + return TRUE; > } > > /* Refresh regcache of gdb thread on given CPU > diff --git a/defs.h b/defs.h > index 06bd27d..1ddfc5e 100644 > --- a/defs.h > +++ b/defs.h > @@ -6083,6 +6083,7 @@ void sort_tgid_array(void); > int sort_by_tgid(const void *, const void *); > int in_irq_ctx(ulonglong, int, ulong); > void check_stack_overflow(void); > +int gdb_change_thread_context (ulong); > > /* > * extensions.c > @@ -6123,6 +6124,7 @@ void dump_kernel_table(int); > void dump_bt_info(struct bt_info *, char *where); > void dump_log(int); > void parse_kernel_version(char *); > +int crash_set_thread(ulong); > > #define LOG_LEVEL(v) ((v) & 0x07) > #define SHOW_LOG_LEVEL (0x1) > @@ -8191,7 +8193,6 @@ enum ppc64_regnum { > }; > > /* crash_target.c */ > -extern int gdb_change_cpu_context (unsigned int cpu); > extern void gdb_refresh_regcache (unsigned int cpu); > > #endif /* !GDB_COMMON */ > diff --git a/gdb-10.2.patch b/gdb-10.2.patch > index 138413f..3e6677d 100644 > --- a/gdb-10.2.patch > +++ b/gdb-10.2.patch > @@ -16058,35 +16058,6 @@ exit 0 > m10200-dis.c > m10200-opc.c > m10300-dis.c > ---- gdb-10.2/gdb/thread.c.orig > -+++ gdb-10.2/gdb/thread.c > -@@ -58,6 +58,11 @@ static int highest_thread_num; > - /* The current/selected thread. */ > - static thread_info *current_thread_; > - > -+#ifdef CRASH_MERGE > -+/* Function to set cpu, defined by crash-utility */ > -+extern "C" void set_cpu (int); > -+#endif > -+ > - /* RAII type used to increase / decrease the refcount of each thread > - in a given list of threads. */ > - > -@@ -1896,7 +1901,13 @@ thread_command (const char *tidstr, int from_tty) > - { > - ptid_t previous_ptid = inferior_ptid; > - > -- thread_select (tidstr, parse_thread_id (tidstr, NULL)); > -+ struct thread_info* thread_id = parse_thread_id (tidstr, NULL); > -+ > -+#ifdef CRASH_MERGE > -+ set_cpu(thread_id->ptid.tid()); > -+#endif > -+ > -+ thread_select (tidstr, thread_id); > - > - /* Print if the thread has not changed, otherwise an event will > - be sent. */ > --- gdb-10.2/gdb/ui-file.h.orig > +++ gdb-10.2/gdb/ui-file.h > @@ -195,6 +195,7 @@ public: > @@ -16139,58 +16110,4 @@ exit 0 > + (dynamic_cast<stdio_file *>gdb_stderr)->set_stream(original_stderr_stream); > } > > - /* > ---- gdb-10.2/gdb/thread.c.orig > -+++ gdb-10.2/gdb/thread.c > -@@ -60,7 +60,7 @@ static thread_info *current_thread_; > - > - #ifdef CRASH_MERGE > - /* Function to set cpu, defined by crash-utility */ > --extern "C" void set_cpu (int); > -+extern "C" void set_cpu(int cpu, int print_context); > - #endif > - > - /* RAII type used to increase / decrease the refcount of each thread > -@@ -1098,6 +1098,9 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads, > - uiout->table_body (); > - } > - > -+#ifdef CRASH_MERGE > -+ int current_cpu = current_thread ? current_thread->ptid.tid() : 0; > -+#endif > - for (inferior *inf : all_inferiors ()) > - for (thread_info *tp : inf->threads ()) > - { > -@@ -1113,6 +1116,11 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads, > - > - ui_out_emit_tuple tuple_emitter (uiout, NULL); > - > -+ /* Switch to the thread's CPU in crash */ > -+#ifdef CRASH_MERGE > -+ set_cpu(tp->ptid.tid(), FALSE); > -+#endif > -+ > - if (!uiout->is_mi_like_p ()) > - { > - if (tp == current_thread) > -@@ -1185,6 +1193,11 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads, > - /* This end scope restores the current thread and the frame > - selected before the "info threads" command, and it finishes the > - ui-out list or table. */ > -+ > -+#ifdef CRASH_MERGE > -+ /* Restore crash's context */ > -+ set_cpu(current_cpu, FALSE); > -+#endif > - } > - > - if (pid == -1 && requested_threads == NULL) > -@@ -1904,7 +1917,7 @@ thread_command (const char *tidstr, int from_tty) > - struct thread_info* thread_id = parse_thread_id (tidstr, NULL); > - > - #ifdef CRASH_MERGE > -- set_cpu(thread_id->ptid.tid()); > -+ set_cpu(thread_id->ptid.tid(), FALSE); > - #endif > - > - thread_select (tidstr, thread_id); > + /* > \ No newline at end of file > diff --git a/kernel.c b/kernel.c > index c7de73a..49d422b 100644 > --- a/kernel.c > +++ b/kernel.c > @@ -6543,6 +6543,28 @@ set_cpu(int cpu, int print_context) > show_context(CURRENT_CONTEXT()); > } > > +int > +crash_set_thread(ulong task) > +{ > + bool found = FALSE; > + struct task_context *tc = FIRST_CONTEXT(); > + > + for (int i = 0; i < RUNNING_TASKS(); i++, tc++) { > + if (tc->task == task) { > + found = TRUE; > + break; > + } > + } > + > + if (!found) > + return -1; > + > + if (CURRENT_TASK() == tc->task) > + return tc->processor; > + > + set_context(tc->task, NO_PID, TRUE); > + return tc->processor; > +} > > /* > * Collect the irq_desc[] entry along with its associated handler and > diff --git a/task.c b/task.c > index a405b05..ef79f53 100644 > --- a/task.c > +++ b/task.c > @@ -715,7 +715,8 @@ task_init(void) > * crash_target::fetch_registers, so CPU 0's registers are shown as > * <unavailable> in gdb mode > * */ > - gdb_refresh_regcache(0); > + for (int i = 0; i < get_cpus_online(); i++) > + gdb_refresh_regcache(i); Does x86 require all regcache(s) to be refreshed at init time ? Till v4 of my ppc series, I also was doing refresh of all CPUs, but it's better to do it when required, and there was a review in v4, patch 5 from Lianbo, quoting it here: > In addition, I haven't tested how far it takes the time in the > multi-core environment, is it possible to implement a smaller > granularity refresh operation? For example: When we need switch to > another cpu, execute the refresh operation for corresponding regcache. CPU 0's refresh is required due to gdb having already tried to initialised registers before we initialised crash_target This might add a small noticeable delay on vmcores from big systems. Adds around 4.6 seconds for a ppc64 vmcore with 64 CPUs, in crash init time, while reading it on a fast 48 cpu Power8 system. This overhead is less for systems with less CPUs, eg. ~0.43 seconds, on a vmcore of 8 CPUs x86_64 system. Thanks, Aditya Gupta > > tt->flags |= TASK_INIT_DONE; > } > @@ -5315,7 +5316,7 @@ set_context(ulong task, ulong pid, uint update_gdb_thread) > > /* change the selected thread in gdb, according to current context */ > if (update_gdb_thread) > - return gdb_change_cpu_context(tc->processor); > + return gdb_change_thread_context(tc->task); > else > return TRUE; > } else { > -- > 2.40.1 > -- Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki