Hi Aditya,

On Wed, Mar 13, 2024 at 5:33 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:
>
> Hi Tao,
> On Wed, Mar 13, 2024 at 11:23:56AM +0800, Tao Liu wrote:
> > Hi Aditya,
> >
> > On Tue, Mar 12, 2024 at 5:12 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:
> > >
> > > Hi Tao,
> > >
> > > > <...>
> > > >
> > > > +crash_target *target = NULL;
> > > > +
> > > >  void
> > > >  crash_target_init (void)
> > > >  {
> > > >    int nr_cpus = crash_get_nr_cpus();
> > > > -  crash_target *target = new crash_target ();
> > > > +  target = new crash_target ();
> > > >
> > > >    /* Own the target until it is successfully pushed. */
> > > >    target_ops_up target_holder (target);
> > > > @@ -137,29 +139,35 @@ crash_target_init (void)
> > > >    reinit_frame_cache ();
> > > >  }
> > > >
> > > > -/*
> > > > - * Change gdb's thread context to the thread on given CPU
> > > > - **/
> > > >  extern "C" int
> > > > -gdb_change_cpu_context(unsigned int cpu)
> > > > +gdb_change_thread_context (ulong task)
> > > >  {
> > > > -  ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu);
> > > > -  inferior *inf = current_inferior ();
> > > > -  thread_info *tp = find_thread_ptid (inf, ptid);
> > > > -
> > > > -  if (tp == nullptr)
> > > > +  int tried = 0;
> > > > +  inferior* inf = current_inferior();
> > > > +  int cpu = crash_set_thread(task);
> > > > +  if (cpu < 0)
> > > >      return FALSE;
> > > >
> > > > -  /* Making sure that crash's context is same */
> > > > -  set_cpu(cpu, FALSE);
> > > > -
> > > > -  /* Switch to the thread */
> > > > -  switch_to_thread(tp);
> > > > -
> > > > -  /* Fetch/Refresh thread's registers */
> > > > -  gdb_refresh_regcache(cpu);
> > > > +  ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu);
> > > >
> > > > -  return TRUE;
> > > > +retry:
> > > > +  thread_info *tp = find_thread_ptid (inf, ptid);
> > > > +  if (tp == nullptr && !tried) {
> > > > +    thread_info *thread = add_thread_silent(target,
> > > > +                            ptid_t(CRASH_INFERIOR_PID, 0, cpu));
> > >
> > > A minor comment.
> > > Is it possible to do it without needing 'target' as a global? I see there's
> > > 'inf->current_top_target()', but its return type is target_ops, so I don't
> > > know if it can be used.
> > >
> > Thanks for your comments, however I guess that is not workable for this
> > code change. What we want is to add a thread to the target. I found one
> > function, "target.c:new_thread", which could be used as
> > "new_thread(current_inferior(), ptid_t(CRASH_INFERIOR_PID, 0, cpu));"
> > to replace the "add_thread_silent()" code, except we would need to make
> > "new_thread" non-static (see the sketch further below).
> >
> > I prefer to keep the code as it is, because I don't see any side effect
> > of making target a global variable. What do you think?
>
> Then there doesn't seem to be an alternative, so I guess we will have to
> keep the global variable; seems okay to me.
>
> > >
> > > > <...>
> > > >
> > > >  /*
> > > >   * Collect the irq_desc[] entry along with its associated handler and
> > > > diff --git a/task.c b/task.c
> > > > index a405b05..ef79f53 100644
> > > > --- a/task.c
> > > > +++ b/task.c
> > > > @@ -715,7 +715,8 @@ task_init(void)
> > > >           * crash_target::fetch_registers, so CPU 0's registers are shown as
> > > >           * <unavailable> in gdb mode
> > > >           * */
> > > > -        gdb_refresh_regcache(0);
> > > > +        for (int i = 0; i < get_cpus_online(); i++)
> > > > +                gdb_refresh_regcache(i);
> > >
> > > Does x86 require all regcache(s) to be refreshed at init time ?
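(For reference, the new_thread() variant discussed above would look roughly like the following. This is an untested sketch: it assumes gdb's static helper were exported with the signature implied by the call quoted above, which is exactly the "make new_thread non-static" prerequisite already mentioned.)

```
/* Untested sketch: instead of keeping the global 'target' pointer for
 * add_thread_silent(), reuse gdb's own new_thread() helper.  It is
 * currently static inside gdb, so it would first need a visible
 * declaration along these lines (signature assumed from the call below): */
thread_info *new_thread (inferior *inf, ptid_t ptid);

/* ...then, inside gdb_change_thread_context(), the add-thread fallback
 * could become: */
thread_info *thread = new_thread (current_inferior (),
                                  ptid_t (CRASH_INFERIOR_PID, 0, cpu));
```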
> > >
> > > Till v4 of my ppc series, I was also refreshing all CPUs, but it's better to do it when required; there was a review comment on v4, patch 5 from Lianbo, quoting it here:
> > >
> > > > In addition, I haven't tested how far it takes the time in the multi-core environment, is it possible to implement a smaller granularity refresh operation? For example: When we need switch to another cpu, execute the refresh operation for corresponding regcache.
> > >
> > > CPU 0's refresh is required because gdb has already tried to initialise registers before we initialised crash_target.
> > >
> > > This might add a small but noticeable delay on vmcores from big systems.
> > >
> > > It adds around 4.6 seconds to crash init time for a ppc64 vmcore with 64 CPUs, while reading it on a fast 48-cpu Power8 system.
> > > The overhead is smaller for systems with fewer CPUs, e.g. ~0.43 seconds on a vmcore of an 8-CPU x86_64 system.
> >
> > Yes, thanks for the testing results. There is a performance downgrade from refreshing all cpu caches. But it gives correct results when the user types "info threads" right after "crash>" is shown:
> >
> > With all cpu caches refreshed:
> > crash> info threads
> >   Id   Target Id     Frame
> >   1    CPU 0         native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   2    CPU 1         native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   3    CPU 2         <unavailable> in ?? ()
> >   4    CPU 3         <unavailable> in ?? ()
> >   5    CPU 4         native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   6    CPU 5         <unavailable> in ?? ()
> >   7    CPU 6         <unavailable> in ?? ()
> > * 8    CPU 7         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   9    CPU 8         native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   10   CPU 9         <unavailable> in ?? ()
> >   11   CPU 10        <unavailable> in ?? ()
> >   12   CPU 11        native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   13   CPU 12        <unavailable> in ?? ()
> >   14   CPU 13        native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   15   CPU 14        <unavailable> in ?? ()
> >   16   CPU 15        <unavailable> in ?? ()
> >   17   CPU 16        <unavailable> in ?? ()
> >   18   CPU 17        native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   19   CPU 18        native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   20   CPU 19        <unavailable> in ?? ()
> >   21   CPU 20        <unavailable> in ?? ()
> >   22   CPU 21        <unavailable> in ?? ()
> >   23   CPU 22        native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   24   CPU 23        <unavailable> in ?? ()
> > crash> q
> >
> > With only cpu0 refreshed:
> > crash> info threads
> >   Id   Target Id     Frame
> >   1    CPU 0         native_safe_halt () at arch/x86/include/asm/irqflags.h:54
> >   2    CPU 1         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   3    CPU 2         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   4    CPU 3         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   5    CPU 4         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   6    CPU 5         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   7    CPU 6         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> > * 8    CPU 7         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   9    CPU 8         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   10   CPU 9         blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   11   CPU 10        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   12   CPU 11        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   13   CPU 12        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   14   CPU 13        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   15   CPU 14        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   16   CPU 15        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   17   CPU 16        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   18   CPU 17        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   19   CPU 18        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   20   CPU 19        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   21   CPU 20        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   22   CPU 21        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   23   CPU 22        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >   24   CPU 23        blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
> >
> > As we can see, the other cpus [1-6, 8-23] just take the regcache of cpu [7], which is incorrect. And if users go further, e.g. "thread 20" and then "gdb bt", they will also get incorrect stack traces.
> >
> > The cpu cache only gets refreshed once the user types "set <pid>", so the cpu cache will be refreshed with the <pid> task's context.
> >
> > I doubt a user will understand all the details and constraints, so I'm afraid the user will be confused by the faulty output. But I also have no objection if performance is the priority. Basically it is a balance of costs and gains.
> > In addition, since "info" and "thread" are commands provided by gdb, I currently don't know how to hook into them so that the cpu cache can be refreshed when "info threads" or "thread <num>" is invoked.
> >
> > Do you have any thoughts?
>
> I had also faced that issue initially, i.e. the other CPUs using the same regcache if all are not refreshed.
> While iterating through all threads, gdb switches its context temporarily while crash's context remains the same, thus causing gdb to get the same registers for all threads other than 0.
>
> This was solved in patch #3 (synchronise cpu context changes between crash/gdb) of the ppc 'Improve stack unwind on ppc64' series, by syncing gdb's context with crash's.
>
> Can this change to thread.c in gdb-10.2.patch in patch #2 be reverted ?
> That will fix it.

Could you share your patch, based on your v10 and my v1 patch series, so I can get a clue how to do this? I tried but was unsuccessful. Since I have changed your #3 patch a bit in my v1 patch series, such as gdb_change_cpu_context() -> gdb_change_thread_context(), I doubt that's the reason for the failure.

What I did was keep "set_cpu" in thread.c:thread_command() as gdb-10.2.patch describes in your #3 patch. But only one thread gets refreshed when I invoke "thread X", and no regcache is refreshed when I invoke "info threads".

Thanks,
Tao Liu

>
> ```
> ---- gdb-10.2/gdb/thread.c.orig
> -+++ gdb-10.2/gdb/thread.c
> -@@ -60,7 +60,7 @@ static thread_info *current_thread_;
> -
> - #ifdef CRASH_MERGE
> - /* Function to set cpu, defined by crash-utility */
> --extern "C" void set_cpu (int);
> -+extern "C" void set_cpu(int cpu, int print_context);
> - #endif
> -
> - /* RAII type used to increase / decrease the refcount of each thread
> -@@ -1098,6 +1098,9 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads,
> -       uiout->table_body ();
> -     }
> -
> -+#ifdef CRASH_MERGE
> -+  int current_cpu = current_thread ? current_thread->ptid.tid() : 0;
> -+#endif
> -   for (inferior *inf : all_inferiors ())
> -     for (thread_info *tp : inf->threads ())
> -       {
> -@@ -1113,6 +1116,11 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads,
> -
> -       ui_out_emit_tuple tuple_emitter (uiout, NULL);
> -
> -+      /* Switch to the thread's CPU in crash */
> -+#ifdef CRASH_MERGE
> -+      set_cpu(tp->ptid.tid(), FALSE);
> -+#endif
> -+
> ```
>
> 'info threads' keeps changing the gdb context temporarily while it prints each thread's frame and other information.
> Correspondingly, this change ensured that crash's context also syncs with the change in gdb's context.
>
> Then, when the user does 'info threads', it will refresh all regcaches, since it has to go through each thread.
>
> In that case, the issue of the same frame being printed for all threads should not happen.
>
> That should remove the need to refresh all regcaches at init time, and leave it for later, when the user needs it. Refreshing all regcaches at init time is the sure-shot way to make sure all regcaches are correct, but I guess everyone will do an 'info threads' anyway, so we can delay the overhead until then. What do you think ?
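To make the "delay the overhead to later" idea concrete, here is a rough, untested sketch of what an on-demand per-CPU refresh could look like on the crash side. gdb_refresh_regcache() and get_cpus_online() are the existing helpers used in the hunks above; the regcache_refreshed[] bookkeeping and the helper name are hypothetical:

```
/*
 * Untested sketch of the "smaller granularity" refresh suggested above:
 * remember which CPUs already had their regcache populated, and refresh
 * a CPU only the first time its thread is actually selected, instead of
 * looping over every online CPU in task_init().
 * (regcache_refreshed[] is illustrative bookkeeping, not existing code.)
 */
static char regcache_refreshed[NR_CPUS];

static void
refresh_regcache_once(int cpu)
{
	if (cpu < 0 || cpu >= get_cpus_online())
		return;

	if (!regcache_refreshed[cpu]) {
		gdb_refresh_regcache(cpu);
		regcache_refreshed[cpu] = 1;
	}
}
```

With something like that, task_init() would only need refresh_regcache_once(0) for the CPU 0 case, and the set_cpu()/gdb_change_thread_context() paths could call it for the other CPUs the first time 'info threads' or 'set <pid>' actually reaches them.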
>
> Thanks,
> Aditya Gupta
>
> >
> > Thanks,
> > Tao Liu
> >
> >
> > >
> > > Thanks,
> > > Aditya Gupta
> > >
> > >
> > > >        tt->flags |= TASK_INIT_DONE;
> > > >  }
> > > > @@ -5315,7 +5316,7 @@ set_context(ulong task, ulong pid, uint update_gdb_thread)
> > > >
> > > >                /* change the selected thread in gdb, according to current context */
> > > >                if (update_gdb_thread)
> > > > -                      return gdb_change_cpu_context(tc->processor);
> > > > +                      return gdb_change_thread_context(tc->task);
> > > >                else
> > > >                        return TRUE;
> > > >        } else {
> > > > --
> > > > 2.40.1
> > > >
> > >
> >
>
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki