[Crash-utility] Re: [PATCH] x86_64: Add top_of_kernel_stack_padding for kernel stack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thank you for pointing out this issue, Kazu.

On 6/12/24 10:42 AM, HAGIO KAZUHITO(萩尾 一仁) wrote:
Hi Lianbo, Tao,

On 2024/06/11 18:05, lijiang wrote:

      > > With kernel patch [1], x86_64 will add extra padding for kernel stack,
      > > as a result, the pt_regs will be shift down by the offset of padding.
      > > Without the patch, the values of registers read from pt_regs will be
      > > incorrect.
      > >
      > > Though currently the TOP_OF_KERNEL_STACK_PADDING is configured by
      > > Kconfig, according to kernel code comment [2], the value may be made
      > > dynamicly later. In addition there might be systems compiled without
      > > Kconfig avaliable. So in this patch, we will calculate the value of
      > > TOP_OF_KERNEL_STACK_PADDING.
      > >
      > > The calculation is as follows:
      > >
      > > 1) in startup_64(), there is a lea instruction as:
      > >     leaq (__end_init_task - TOP_OF_KERNEL_STACK_PADDING - PTREGS_SIZE)(%rip), %rsp
      > >
      > > 2) in rewind_stack_and_make_dead(), there is a lea instruction as:
      > >     leaq      -PTREGS_SIZE(%rax), %rsp
      > >
      > > The disassembled 2 instructions will be like:
      > >
      > > 1) 0xffffffff93a0007d <startup_64+3>:      lea    0x1e03ec4(%rip),%rsp        # 0xffffffff95803f48
      > >                                                                                ^^^^^^^^^^^^^^^^^^^^
      > > 2) 0xffffffff93a0465a <rewind_stack_and_make_dead+10>:     lea    -0xa8(%rax),%rsp
      > >                                                                     ^^^^
      > > 0xffffffff95803f48 is the value of (__end_init_task -
      > > TOP_OF_KERNEL_STACK_PADDING - PTREGS_SIZE), and 0xa8 is the value of
      > > PTREGS_SIZE, __end_init_task can be get by symbol reading.
      >
      > Calculating the value of TOP_OF_KERNEL_STACK_PADDING, which looks good, but it heavily relies on compiler.
      > Normally we would use this way unless there is no other choice.
      >
      > How about the following changes? Although it doesn't handle the case that the value is dynamic, let's see
      > how to change in the kernel in future, and then consider how to reflect it in crash-utility.
      >
     Sure, looks good to me, so let's go with this, and update it later
     when kernel changes.


Ok. Thanks, Tao.

Applied with minor changes:

https://github.com/crash-utility/crash/commit/48764a14bc5856f0b0bb30685336c68b832154fc <https://github.com/crash-utility/crash/commit/48764a14bc5856f0b0bb30685336c68b832154fc>
It looks like there is a regression with kernels without CONFIG_X86_FRED.
Could you check?

Can you help to check if this can work for you?

diff --git a/x86_64.c b/x86_64.c
index 6777c93e6b47..469d26b05e24 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -4086,7 +4086,7 @@ in_exception_stack:

         if (!irq_eframe && !is_kernel_thread(bt->tc->task) &&
             (GET_STACKBASE(bt->tc->task) == bt->stackbase)) {
-               long stack_padding_size = SIZE(fred_frame) > 0 ? (2*8) : 0;
+               long stack_padding_size = VALID_SIZE(fred_frame) ? (2*8) : 0;
                user_mode_eframe = bt->stacktop - SIZE(pt_regs);
                if (last_process_stack_eframe < user_mode_eframe)
                        x86_64_exception_frame(EFRAME_PRINT, 0, bt->stackbuf +
@@ -4408,7 +4408,7 @@ in_exception_stack:

         if (!irq_eframe && !is_kernel_thread(bt->tc->task) &&
             (GET_STACKBASE(bt->tc->task) == bt->stackbase)) {
-               long stack_padding_size = SIZE(fred_frame) > 0 ? (2*8) : 0;diff --git a/x86_64.c b/x86_64.c
index 6777c93e6b47..469d26b05e24 100644
--- a/x86_64.c
+++ b/x86_64.c
@@ -4086,7 +4086,7 @@ in_exception_stack:

         if (!irq_eframe && !is_kernel_thread(bt->tc->task) &&
             (GET_STACKBASE(bt->tc->task) == bt->stackbase)) {
-               long stack_padding_size = SIZE(fred_frame) > 0 ? (2*8) : 0;
+               long stack_padding_size = VALID_SIZE(fred_frame) ? (2*8) : 0;
                user_mode_eframe = bt->stacktop - SIZE(pt_regs);
                if (last_process_stack_eframe < user_mode_eframe)
                        x86_64_exception_frame(EFRAME_PRINT, 0, bt->stackbuf +
@@ -4408,7 +4408,7 @@ in_exception_stack:

         if (!irq_eframe && !is_kernel_thread(bt->tc->task) &&
             (GET_STACKBASE(bt->tc->task) == bt->stackbase)) {
-               long stack_padding_size = SIZE(fred_frame) > 0 ? (2*8) : 0;
+               long stack_padding_size = VALID_SIZE(fred_frame) ? (2*8) : 0;
                user_mode_eframe = bt->stacktop - SIZE(pt_regs);
                if (last_process_stack_eframe < user_mode_eframe)
                        x86_64_exception_frame(EFRAME_PRINT, 0, bt->stackbuf +

+               long stack_padding_size = VALID_SIZE(fred_frame) ? (2*8) : 0;
                user_mode_eframe = bt->stacktop - SIZE(pt_regs);
                if (last_process_stack_eframe < user_mode_eframe)
                        x86_64_exception_frame(EFRAME_PRINT, 0, bt->stackbuf +

I will fix it later with the patch if it works fine from  your side.


Thanks

Lianbo


crash> bt 1

bt: invalid structure size: fred_frame
      FILE: x86_64.c  LINE: 4089  FUNCTION: x86_64_low_budget_back_trace_cmd()

[/home/k-hagio/bin/crash] error trace: 588df3 => 5cbc72 => 5eb3e1 => 5eb366
PID: 1        TASK: ffff9f94c024b980  CPU: 2    COMMAND: "systemd"
   #0 [ffffade44001bca8] __schedule at ffffffffb948ebbb
   #1 [ffffade44001bd10] schedule at ffffffffb948f04d
   #2 [ffffade44001bd20] schedule_hrtimeout_range_clock at ffffffffb9494fef
   #3 [ffffade44001bda8] ep_poll at ffffffffb8c91be8
   #4 [ffffade44001be48] do_epoll_wait at ffffffffb8c91d11
   #5 [ffffade44001be80] __x64_sys_epoll_wait at ffffffffb8c92590
   #6 [ffffade44001bed0] do_syscall_64 at ffffffffb947f459
   #7 [ffffade44001bf50] entry_SYSCALL_64_after_hwframe at ffffffffb96000ea

    5eb366: SIZE_verify.part.42+70
    5eb3e1: SIZE_verify+49
    5cbc72: x86_64_low_budget_back_trace_cmd+3010
    588df3: back_trace+1523

bt: invalid structure size: fred_frame
      FILE: x86_64.c  LINE: 4089  FUNCTION: x86_64_low_budget_back_trace_cmd()

crash> sys
        KERNEL: vmlinux
      DUMPFILE: vmcore
          CPUS: 4
          DATE: Wed Jun 12 02:28:24 JST 2024
        UPTIME: 8 days, 22:17:18
LOAD AVERAGE: 0.03, 0.01, 0.00
         TASKS: 182
      NODENAME: rhel94u
       RELEASE: 5.14.0-427.13.1.el9_4.x86_64
       VERSION: #1 SMP PREEMPT_DYNAMIC Wed Apr 10 10:29:16 EDT 2024
       MACHINE: x86_64  (3408 Mhz)
        MEMORY: 4 GB
         PANIC: "Kernel panic - not syncing: sysrq triggered crash"
Thanks,
Kazu
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux