[Crash-utility] Re: [[PATCH v2]] Clean up on top of one-thread-v2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi folks, appologize for late responce due to other errands.

I performed testing of one-thread-v2 from Tao's github repo.
And  performed several cleanups steps there.
Please see my PR here: https://github.com/liutgnu/crash-dev/pull/1

What is the latest repo/branch I can also run my tests against?
Is it: https://github.com/liutgnu/crash-dev/tree/tao-rebase-v3

Regards,
--Alexey

On 4/4/24 5:38 PM, Tao Liu wrote:
Hi Aditya,

On Thu, Apr 4, 2024 at 10:02 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:

Hi Tao & Alexey,

On Mon, Apr 01, 2024 at 05:14:41PM +0800, Tao Liu wrote:
Hi Aditya & Alexey,

On Mon, Apr 1, 2024 at 4:39 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:

Hi Tao & Alexey,

Please have a test or review, any comments would be nice.

[1]: https://github.com/liutgnu/crash-dev/tree/tao-rebase-v2
[2]: https://github.com/adi-g15-ibm/crash/tree/stack-unwind-one-thread-merged

Works nicely for me. nitpick in commit 'Let crash change gdb context':


Thanks for the comment, I get it updated. Please see v3 at:

https://github.com/liutgnu/crash-dev/tree/tao-rebase-v3

Works for me. Should we post the base patch series ?

I have no objections for posting the base patch series, however since
I have merged & modified Alexey's patch. As I mentioned in the
previous emails, there are 3 places modified. I'd like to know
Alexey's comments for these modifications, I'm not sure if they are
appropriate.

Hi Alexey, do you have any commnets for the 3 modifications? If no
then we can post the patches for upstream review.

Thanks,
Tao Liu


Thanks,
Aditya Gupta


Thanks,
Tao Liu

Instead of:

```
@@ -6068,6 +6068,7 @@ void sort_tgid_array(void);
  int sort_by_tgid(const void *, const void *);
  int in_irq_ctx(ulonglong, int, ulong);
  void check_stack_overflow(void);
+int gdb_change_thread_context (ulong);

  /*
   *  extensions.c
@@ -8050,4 +8051,7 @@ enum x86_64_regnum {
          LAST_REGNUM
  };

+/* crash_target.c */
+extern int gdb_change_cpu_context (unsigned int cpu);
+
  #endif /* !GDB_COMMON */
```

We can directly have:

```
@@ -8050,4 +8051,7 @@ enum x86_64_regnum {
          LAST_REGNUM
  };

+/* crash_target.c */
+extern int gdb_change_thread_context (ulong task);
+
  #endif /* !GDB_COMMON */
```

Since 'gdb_change_cpu_context' is no more defined in any patch, and
gdb_change_thread_context is defined in crash_target.c

Thanks

- Aditya Gupta


Thanks,
Tao Liu

On Wed, Mar 27, 2024 at 4:05 AM Alexey Makhalov
<alexey.makhalov@xxxxxxxxxxxx> wrote:

Hi Tao,

I need to see if there is any existing code in crash which is doing it.
I would like to avoid wheel reinvention.

For just stack unwinding, IP/SP is enough. But to see function arguments and local variables
especially lower frames, gdb needs to know other GPRs.

inactive_task_frame does not use pt_regs layout for saved registers and from the latest Linux sources
__switch_to_asm() does not provide dwarf/debug annotations regarding where the registers
were saved. That's why I ended up with an inactive_task_frame parsing implementation.

About other kernel versions, I need to see how it has evolved over time. AFAIR, at some point
in the past it was pt_regs based.

Thanks,
--Alexey


On Tue, Mar 26, 2024 at 2:14 AM Tao Liu <ltao@xxxxxxxxxx> wrote:

Hi Alexey,

For the rest of the patch, I didn't see any problem. But since it only
supports modern kernels, could you please implement the rest, so I can
have a testing for variety versions of kernels(ranging from 2.6.x to
5.x), in order to find if there are further problems.

Thanks,
Tao Liu

On Mon, Mar 25, 2024 at 11:34 AM Tao Liu <ltao@xxxxxxxxxx> wrote:

Hi Alexey,

On Fri, Mar 22, 2024 at 7:20 AM Alexey Makhalov
<alexey.makhalov@xxxxxxxxxxxx> wrote:

Hi folks, please take a look on v2 of this patch and give it a try.
IMHO, Only remaining piece is to clean up x86_64_get_current_task_reg
around getting pt_regs.

Things done:
1. Rename machdep->get_cpu_reg to machdep->get_current_task_reg and rename
    x86_64_get_cpu_reg to x86_64_get_current_task_reg, plus corresponding changes
    for arm64 and ppc64.
2. Untied gdb's TPID from any tasks or CPUs ID. Always use CURRENT_CONTEXT
    for regs and properties fetching.
3. When CURRENT switched, gdb_change_thread_context() must be called to
    invalidate caches.
4. Update x86_64_get_current_task_reg to support inactive tasks for offline
    debugging.
5. Fixed frame selection issue. It was caused by multiple calls of set_context()
    which calls gdb_change_thread_context() unnecessarily dropping gdb caches.

TODO:
1. x86_64_get_current_task_reg is questionable about fetching pt_regs for
    inactive tasks. How do you gyus get pt_regs structure on modern linux kernels?
    PS: I added my approach. More testing required.

Known issues
1. After switching from all tasks to one thread:
    b. unwinding started to work differently around top (garbaged) frames - Not quite
    an issue for me.

Signed-off-by: Alexey Makhalov <alexey.makhalov@xxxxxxxxxxxx>
---
  arm64.c         |  4 ++--
  crash_target.c  | 48 ++++++++++++++++++-------------------
  defs.h          | 10 +++-----
  gdb_interface.c |  8 +++----
  kernel.c        | 23 ------------------
  ppc64.c         |  8 +++----
  task.c          | 16 ++++---------
  x86_64.c        | 64 ++++++++++++++++++++++++++++++++++++++++++++-----
  8 files changed, 99 insertions(+), 82 deletions(-)

diff --git a/arm64.c b/arm64.c
index 8ff10bf..340db2d 100644
--- a/arm64.c
+++ b/arm64.c
@@ -153,7 +153,7 @@ static void arm64_calc_kernel_start(void)
  }

  static int
-arm64_get_cpu_reg(int cpu, int regno, const char *name,
+arm64_get_current_task_reg(int regno, const char *name,
                     int size, void *value)
  {
         struct bt_info bt_info, bt_setup;
@@ -511,7 +511,7 @@ arm64_init(int when)
                 machdep->dumpfile_init = NULL;
                 machdep->verify_line_number = NULL;
                 machdep->init_kernel_pgd = arm64_init_kernel_pgd;
-               machdep->get_cpu_reg = arm64_get_cpu_reg;
+               machdep->get_current_task_reg = arm64_get_current_task_reg;

                 /* use machdep parameters */
                 arm64_calc_phys_offset();
diff --git a/crash_target.c b/crash_target.c
index fb3b634..6e96506 100644
--- a/crash_target.c
+++ b/crash_target.c
@@ -26,11 +26,8 @@
  void crash_target_init (void);

  extern "C" int gdb_readmem_callback(unsigned long, void *, int, int);
-extern "C" int crash_get_cpu_reg (int cpu, int regno, const char *regname,
+extern "C" int crash_get_current_task_reg (int regno, const char *regname,
                                    int regsize, void *val);
-extern "C" void gdb_refresh_regcache(unsigned int cpu);
-extern "C" int set_cpu(int cpu, int print_context);
-extern "C" int crash_set_thread(ulong);
  extern "C" int gdb_change_thread_context (ulong);
  extern "C" void crash_get_current_task_info(unsigned long *pid, char **comm);

@@ -72,22 +69,27 @@ public:

  };

-/* We just get all the registers, so we don't use regno.  */
  void
  crash_target::fetch_registers (struct regcache *regcache, int regno)
  {
+  int r;
    gdb_byte regval[16];
-  int cpu = inferior_ptid.tid();
    struct gdbarch *arch = regcache->arch ();

-  for (int r = 0; r < gdbarch_num_regs (arch); r++)
+  if (regno >= 0) {
+    r = regno;
+    goto onetime;
+  }
+
+  for (r = 0; regno == -1 && r < gdbarch_num_regs (arch); r++)
      {
+onetime:
        const char *regname = gdbarch_register_name(arch, r);
        int regsize = register_size (arch, r);
        if (regsize > sizeof (regval))
          error (_("fatal error: buffer size is not enough to fit register value"));

-      if (crash_get_cpu_reg (cpu, r, regname, regsize, (void *)&regval))
+      if (crash_get_current_task_reg (r, regname, regsize, (void *)&regval))
          regcache->raw_supply (r, regval);
        else
          regcache->raw_supply (r, NULL);
@@ -142,19 +144,17 @@ crash_target_init (void)
  extern "C" int
  gdb_change_thread_context (ulong task)
  {
-  inferior* inf = current_inferior();
-  int cpu = crash_set_thread(task);
-  if (cpu < 0)
-    return FALSE;
-
-  ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu);
-   thread_info *tp = find_thread_ptid (inf, ptid);
-
-   if (tp == nullptr)
-     return FALSE;
-
-   target_fetch_registers(get_thread_regcache(tp), -1);
-   switch_to_thread(tp);
-   reinit_frame_cache ();
-   return TRUE;
-}
\ No newline at end of file
+  static ulong prev_task = 0;
+
+  /*
+   * Crash calls this method even if CURRENT task is not updated.
+   * Ignore it and keep gdb caches active.
+   */
+  if (task == prev_task)
+         return TRUE;
+  prev_task = task;
+
+  registers_changed();
+  reinit_frame_cache();
+  return TRUE;
+}
diff --git a/defs.h b/defs.h
index d7e3b18..7596179 100644
--- a/defs.h
+++ b/defs.h
@@ -1081,7 +1081,7 @@ struct machdep_table {
          void (*get_irq_affinity)(int);
          void (*show_interrupts)(int, ulong *);
         int (*is_page_ptr)(ulong, physaddr_t *);
-       int (*get_cpu_reg)(int, int, const char *, int, void *);
+       int (*get_current_task_reg)(int, const char *, int, void *);
         int (*is_cpu_prstatus_valid)(int cpu);
  };

@@ -6124,7 +6124,6 @@ void dump_kernel_table(int);
  void dump_bt_info(struct bt_info *, char *where);
  void dump_log(int);
  void parse_kernel_version(char *);
-int crash_set_thread(ulong);

  #define LOG_LEVEL(v) ((v) & 0x07)
  #define SHOW_LOG_LEVEL    (0x1)
@@ -8023,7 +8022,7 @@ extern int have_full_symbols(void);

  /*
   * Register numbers must be in sync with gdb/features/i386/64bit-core.c
- * to make crash_target->fetch_registers() ---> machdep->get_cpu_reg()
+ * to make crash_target->fetch_registers() ---> machdep->get_current_task_reg()
   * working properly.
   */
  enum x86_64_regnum {
@@ -8081,7 +8080,7 @@ enum arm64_regnum {

  /*
   * Register numbers to make crash_target->fetch_registers()
- * ---> machdep->get_cpu_reg() work properly.
+ * ---> machdep->get_current_task_reg() work properly.
   *
   *  These register numbers and names are given according to output of
   *  `rs6000_register_name`, because that is what was being used by
@@ -8199,7 +8198,4 @@ enum ppc64_regnum {
         PPC64_VRSAVE_REGNU = 139
  };

-/* crash_target.c */
-extern void gdb_refresh_regcache (unsigned int cpu);
-
  #endif /* !GDB_COMMON */
diff --git a/gdb_interface.c b/gdb_interface.c
index 28264c2..2e9117a 100644
--- a/gdb_interface.c
+++ b/gdb_interface.c
@@ -1073,14 +1073,14 @@ unsigned long crash_get_kaslr_offset(void)
  }

  /* Callbacks for crash_target */
-int crash_get_cpu_reg (int cpu, int regno, const char *regname,
+int crash_get_current_task_reg (int regno, const char *regname,
                         int regsize, void *val);

-int crash_get_cpu_reg (int cpu, int regno, const char *regname,
+int crash_get_current_task_reg (int regno, const char *regname,
                         int regsize, void *value)
  {
-        if (!machdep->get_cpu_reg)
+        if (!machdep->get_current_task_reg)
                  return FALSE;
-        return machdep->get_cpu_reg(cpu, regno, regname, regsize, value);
+        return machdep->get_current_task_reg(regno, regname, regsize, value);
  }

diff --git a/kernel.c b/kernel.c
index 446fd78..15b00dc 100644
--- a/kernel.c
+++ b/kernel.c
@@ -6543,29 +6543,6 @@ set_cpu(int cpu, int print_context)
                 show_context(CURRENT_CONTEXT());
  }

-int
-crash_set_thread(ulong task)
-{
-       bool found = FALSE;
-       struct task_context *tc = FIRST_CONTEXT();
-
-       for (int i = 0; i < RUNNING_TASKS(); i++, tc++) {
-               if (tc->task == task) {
-                       found = TRUE;
-                       break;
-               }
-       }
-
-       if (!found)
-               return -1;
-
-       if (CURRENT_TASK() == tc->task)
-               return 0;
-
-       set_context(tc->task, NO_PID, TRUE);
-       return 0;
-}
-
  /*
   *  Collect the irq_desc[] entry along with its associated handler and
   *  action structures.
diff --git a/ppc64.c b/ppc64.c
index a87e621..aeccb70 100644
--- a/ppc64.c
+++ b/ppc64.c
@@ -55,7 +55,7 @@ static void ppc64_set_bt_emergency_stack(enum emergency_stack_type type,
  static char * ppc64_check_eframe(struct ppc64_pt_regs *);
  static void ppc64_print_eframe(char *, struct ppc64_pt_regs *,
                 struct bt_info *);
-static int ppc64_get_cpu_reg(int cpu, int regno, const char *name, int size,
+static int ppc64_get_current_task_reg(int regno, const char *name, int size,
                   void *value);
  static void parse_cmdline_args(void);
  static int ppc64_paca_percpu_offset_init(int);
@@ -706,7 +706,7 @@ ppc64_init(int when)
                                 error(FATAL, "cannot malloc hwirqstack buffer space.");
                 }

-               machdep->get_cpu_reg = ppc64_get_cpu_reg;
+               machdep->get_current_task_reg = ppc64_get_current_task_reg;

                 ppc64_init_paca_info();

@@ -2506,7 +2506,7 @@ ppc64_print_eframe(char *efrm_str, struct ppc64_pt_regs *regs,
  }

  static int
-ppc64_get_cpu_reg(int cpu, int regno, const char *name, int size,
+ppc64_get_current_task_reg(int regno, const char *name, int size,
                   void *value)
  {
         struct bt_info bt_info, bt_setup;
@@ -2555,7 +2555,7 @@ ppc64_get_cpu_reg(int cpu, int regno, const char *name, int size,
         pt_regs = (struct ppc64_pt_regs *)bt_info.machdep;

         if (!pt_regs) {
-               error(WARNING, "pt_regs not available for cpu %d\n", cpu);
+               error(WARNING, "pt_regs not available for cpu %d\n", tc->processor);
                 return FALSE;
         }

diff --git a/task.c b/task.c
index f7e5b05..5f1ddad 100644
--- a/task.c
+++ b/task.c
@@ -11286,19 +11286,11 @@ check_stack_end_magic:
                 fprintf(fp, "No stack overflows detected\n");
  }

+void crash_get_current_task_info(unsigned long *pid, char **comm);
  void crash_get_current_task_info(unsigned long *pid, char **comm)
  {
-       int i;
-       struct task_context *tc;
+       struct task_context *tc = CURRENT_CONTEXT();

-       tc = FIRST_CONTEXT();
-       for (i = 0; i < RUNNING_TASKS(); i++, tc++)
-               if (tc->task == CURRENT_TASK()) {
-                       *pid = tc->pid;
-                       *comm = tc->comm;
-                       return;
-               }
-       *pid = 0;
-       *comm = NULL;
-       return;
+       *pid = tc->pid;
+       *comm = tc->comm;
  }
diff --git a/x86_64.c b/x86_64.c
index 41e67c6..73bb353 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -126,7 +126,7 @@ static int x86_64_get_framesize(struct bt_info *, ulong, ulong, char *);
  static void x86_64_framesize_debug(struct bt_info *);
  static void x86_64_get_active_set(void);
  static int x86_64_get_kvaddr_ranges(struct vaddr_range *);
-static int x86_64_get_cpu_reg(int, int, const char *, int, void *);
+static int x86_64_get_current_task_reg(int, const char *, int, void *);
  static int x86_64_verify_paddr(uint64_t);
  static void GART_init(void);
  static void x86_64_exception_stacks_init(void);
@@ -207,7 +207,7 @@ x86_64_init(int when)
                 machdep->machspec->irq_eframe_link = UNINITIALIZED;
                 machdep->machspec->irq_stack_gap = UNINITIALIZED;
                 machdep->get_kvaddr_ranges = x86_64_get_kvaddr_ranges;
-               machdep->get_cpu_reg = x86_64_get_cpu_reg;
+               machdep->get_current_task_reg = x86_64_get_current_task_reg;
                  if (machdep->cmdline_args[0])
                          parse_cmdline_args();
                 if ((string = pc->read_vmcoreinfo("relocate"))) {
@@ -898,7 +898,7 @@ x86_64_dump_machdep_table(ulong arg)
          fprintf(fp, "        is_page_ptr: x86_64_is_page_ptr()\n");
          fprintf(fp, "       verify_paddr: x86_64_verify_paddr()\n");
          fprintf(fp, "  get_kvaddr_ranges: x86_64_get_kvaddr_ranges()\n");
-       fprintf(fp, "        get_cpu_reg: x86_64_get_cpu_reg()\n");
+       fprintf(fp, "get_current_task_reg: x86_64_get_current_task_reg()\n");
          fprintf(fp, "    init_kernel_pgd: x86_64_init_kernel_pgd()\n");
          fprintf(fp, "clear_machdep_cache: x86_64_clear_machdep_cache()\n");
         fprintf(fp, " xendump_p2m_create: %s\n", PVOPS_XEN() ?
@@ -9186,7 +9186,7 @@ x86_64_get_kvaddr_ranges(struct vaddr_range *vrp)
                 }

  static int
-x86_64_get_cpu_reg(int cpu, int regno, const char *name,
+x86_64_get_current_task_reg(int regno, const char *name,
                     int size, void *value)
  {
         struct bt_info bt_info, bt_setup;
@@ -9195,8 +9195,6 @@ x86_64_get_cpu_reg(int cpu, int regno, const char *name,
         ulong ip, sp;
         bool ret = FALSE;

-       if (VMSS_DUMPFILE())
-               return vmware_vmss_get_cpu_reg(cpu, regno, name, size, value);
         switch (regno) {
         case RAX_REGNUM ... GS_REGNUM:
         case FS_BASE_REGNUM ... ORIG_RAX_REGNUM:
@@ -9208,6 +9206,60 @@ x86_64_get_cpu_reg(int cpu, int regno, const char *name,
         tc = CURRENT_CONTEXT();
         if (!tc)
                 return FALSE;
+
+       /*
+        * For inactive task, grab rip, rbp, rbx, r12, r13, r14 and r15 from
+        * inactive_task_frame (see __switch_to_asm). Other regs saved on
+        * regular frame.
+        */
+       if (!is_task_active(tc->task)) {
+               int frame_size = STRUCT_SIZE("inactive_task_frame");
+
+               /* Only modern kernels supported. */
+               if (tt->flags & THREAD_INFO && frame_size == 7 * 8) {
+                       ulong rsp;
+                       int offset = 0;
+                       switch (regno) {
+                               case RSP_REGNUM:
+                                       readmem(tc->task + OFFSET(task_struct_thread) +
+                                               OFFSET(thread_struct_rsp), KVADDR,
+                                               &rsp, sizeof(void *),
+                                               "thread_struct rsp", FAULT_ON_ERROR);
+                                       rsp += frame_size;
+                                       memcpy(value, &rsp, size);
+                                       return TRUE;
+                               case RIP_REGNUM:
+                                       offset += 8;
+                               case RBP_REGNUM:
+                                       offset += 8;
+                               case RBX_REGNUM:
+                                       offset += 8;
+                               case R12_REGNUM:
+                                       offset += 8;
+                               case R13_REGNUM:
+                                       offset += 8;
+                               case R14_REGNUM:
+                                       offset += 8;
+                               case R15_REGNUM:
+                                       readmem(tc->task + OFFSET(task_struct_thread) +
+                                               OFFSET(thread_struct_rsp), KVADDR,
+                                               &rsp, sizeof(void *),
+                                               "thread_struct rsp", FAULT_ON_ERROR);
+                                       readmem(rsp + offset, KVADDR, value, sizeof(void *),
+                                                       "inactive_thread_frame saved regs", FAULT_ON_ERROR);
+                                       return TRUE;
+                       }

For this part, it looks similar to what
x86_64.c:x86_64_get_stack_frame():5016 is trying to do, which is
extracting reg values from stack. Is there a special consideration to
put it in here?

IMHO, the x86_64_get_stack_frame() is responsible for extracting reg
values from stack frame or elf notes(aka the regs value reproducer).
And it will give a universal data structure, aka pt_regs, to
x86_64_get_current_task_reg() (aka the regs value consumer).

In this design, the x86_64_get_current_task_reg() is decoupled from
stack, no matter inactive_task_frame or traditional linked list stack
frame or active tasks regs coming from elf notes, the consumer won't
care, all it need is a pt_regs structure and get values from it.
Personally I prefer this design. Any comments?

For other parts of the patch, I'm still reviewing and testing.

Thanks,
Tao Liu

+               }
+               /* TBD: older kernels support. */
+               return FALSE;
+       }
+
+       /*
+        * Task is active, grab CPU's registers
+        */
+       if (VMSS_DUMPFILE())
+               return vmware_vmss_get_cpu_reg(tc->processor, regno, name, size, value);
+
         BZERO(&bt_setup, sizeof(struct bt_info));
         clone_bt_info(&bt_setup, &bt_info, tc);
         fill_stackbuf(&bt_info);
--
2.39.0







--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux