Re: [PATCH] Fix for "bt" command incorrectly printing eframe stack with a bogus warning

On 2023/02/16 13:05, lijiang wrote:
>>> On Thu, Feb 16, 2023 at 8:48 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> wrote:
>>>
>>>     On 2023/02/15 17:24, Lianbo Jiang wrote:
>>>     > Kernel commit 7d65f4a65532 ("irq: Consolidate do_softirq() arch overriden
>>>     > implementations") renamed the call_softirq to do_softirq_own_stack.
>>>     > Crash may incorrectly output the eframe stack with a warning as below:
>>>     >
>>>     >    crash> foreach bt
>>>     >    ...
>>>     >    PID: 0        TASK: ffff914f820a8000  CPU: 25   COMMAND: "swapper/25"
>>>     >     #0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763
>>>     >     #1 [fffffe0000504e50] nmi_handle at ffffffffa662a423
>>>     >     #2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9
>>>     >     #3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f
>>>     >     #4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8
>>>     >        [exception RIP: clone_endio+172]
>>>     >        RIP: ffffffffc005c1ec  RSP: ffffa1d403d08e98  RFLAGS: 00000246
>>>     >        RAX: 0000000000000000  RBX: ffff915326fba230  RCX: 0000000000000018
>>>     >        RDX: ffffffffc0075400  RSI: 0000000000000000  RDI: ffff915326fba230
>>>     >        RBP: ffff915326fba1c0   R8: 0000000000001000   R9: ffff915308d6d2a0
>>>     >        R10: 000000a97dfe5e10  R11: ffffa1d40038fe98  R12: ffff915302babc40
>>>     >        R13: ffff914f94360000  R14: 0000000000000000  R15: 0000000000000000
>>>     >        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>>     >    --- <NMI exception stack> ---
>>>     >     #5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec [dm_mod]
>>>     >     #6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954
>>>     >     #7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968
>>>     >     #8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e
>>>     >     #9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95
>>>     >     #10 [ffffa1d403d08fa0] __softirqentry_text_start at ffffffffa72000dc
>>>     >     #11 [ffffa1d403d08ff0] do_softirq_own_stack at ffffffffa7000f9a
>>>     >    --- <IRQ stack> ---
>>>     >     #12 [ffffa1d40038fe70] do_softirq_own_stack at ffffffffa7000f9a
>>>     >        [exception RIP: unknown or invalid address]
>>>     >        RIP: 0000000000000000  RSP: 0000000000000000  RFLAGS: 00000000
>>>     >        RAX: ffffffffa672eae5  RBX: ffffffffa83b34e0  RCX: ffffffffa672eb12
>>>     >        RDX: 0000000000000010  RSI: 8b7d6c8869010c00  RDI: 0000000000000085
>>>     >        RBP: 0000000000000286   R8: ffff914f820a8000   R9: ffffffffa67a94e0
>>>     >        R10: 0000000000000286  R11: ffffffffa66fb4c5  R12: ffffffffa67a898b
>>>     >        R13: 0000000000000000  R14: fffffffffffffff8  R15: ffffffffa67a1e68
>>>     >        ORIG_RAX: 0000000000000000  CS: 0000  SS: ffffffffa672edff
>>>     >     bt: WARNING: possibly bogus exception frame
>>>     >     #13 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c
>>>     >     #14 [ffffa1d40038ff50] secondary_startup_64_no_verify at ffffffffa6600116
>>>     >     ...
>>>
>>>     Thank you for the patch.  Could I have "bt -c 25" output with the patch
>>>     and "bt -r -c 25 | tail -n 40" output for checking?
>>>
>>> Sure.

Thank you, looks better.  Applied with Georges' suggestion and a few
commit log changes.
https://github.com/crash-utility/crash/commit/e0e6e4a7ee03b3d00b50a9e4db2f2ea6f7da0da3

The irq_eframe_link calculated on RHEL8 may also be wrong, but
I plan to change that part and will revisit it then.

Thanks,
Kazu

>>>
>>> crash> bt -c 25
>>> PID: 0        TASK: ffff914f820a8000  CPU: 25   COMMAND: "swapper/25"
>>>  #0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763
>>>  #1 [fffffe0000504e50] nmi_handle at ffffffffa662a423
>>>  #2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9
>>>  #3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f
>>>  #4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8
>>>     [exception RIP: clone_endio+172]
>>>     RIP: ffffffffc005c1ec  RSP: ffffa1d403d08e98  RFLAGS: 00000246
>>>     RAX: 0000000000000000  RBX: ffff915326fba230  RCX: 0000000000000018
>>>     RDX: ffffffffc0075400  RSI: 0000000000000000  RDI: ffff915326fba230
>>>     RBP: ffff915326fba1c0   R8: 0000000000001000   R9: ffff915308d6d2a0
>>>     R10: 000000a97dfe5e10  R11: ffffa1d40038fe98  R12: ffff915302babc40
>>>     R13: ffff914f94360000  R14: 0000000000000000  R15: 0000000000000000
>>>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>> --- <NMI exception stack> ---
>>>  #5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec [dm_mod]
>>>  #6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954
>>>  #7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968
>>>  #8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e
>>>  #9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95
>>> #10 [ffffa1d403d08fa0] __softirqentry_text_start at ffffffffa72000dc
>>> #11 [ffffa1d403d08ff0] do_softirq_own_stack at ffffffffa7000f9a
>>> --- <IRQ stack> ---
>>> #12 [ffffa1d40038fe70] update_ts_time_stats at ffffffffa67a1e68
>>> #13 [ffffa1d40038fea0] do_softirq at ffffffffa66fb4c5
>>> #14 [ffffa1d40038feb0] flush_smp_call_function_queue at ffffffffa67a94e0
>>> #15 [ffffa1d40038fec0] do_idle at ffffffffa672eae5
>>> #16 [ffffa1d40038ff10] cpu_startup_entry at ffffffffa672edff
>>> #17 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c
>>> #18 [ffffa1d40038ff50] secondary_startup_64_no_verify at ffffffffa6600116
>>> crash>
>>> crash> bt -r -c 25 | tail -n 40
>>> ffffa1d40038fd80:  ffff9156ff26afc0 0000002027c3e7a1
>>> ffffa1d40038fd90:  __next_timer_interrupt+166 ffff9156ff25aa68
>>> ffffa1d40038fda0:  00000000fffd85da 0000000000000082
>>> ffffa1d40038fdb0:  0000000000000082 hrtimer_get_next_event+78
>>> ffffa1d40038fdc0:  ffff9156ff25aa40 000000200cd15980
>>> ffffa1d40038fdd0:  __hrtimer_next_event_base+192 ffff9156ff25cf40
>>> ffffa1d40038fde0:  7fffffffffffffff ffff9156ff25d580
>>> ffffa1d40038fdf0:  0000000000000082 ffffffffffffffff
>>> ffffa1d40038fe00:  0000000000000082 hrtimer_next_event_without+96
>>> ffffa1d40038fe10:  ffff9156ff25d580 000000200cff2040
>>> ffffa1d40038fe20:  000000200cdb406e ffff9153052f8c00
>>> ffffa1d40038fe30:  sched_clock+5    sched_clock_cpu+12
>>> ffffa1d40038fe40:  acpi_idle_driver+136 ffff9153052f8c00
>>> ffffa1d40038fe50:  cpuidle_enter_state+181 000000203501afdb
>>> ffffa1d40038fe60:  00ffffffa82c44c0 read_tsc
>>> ffffa1d40038fe70:  update_ts_time_stats+88 fffffffffffffff8
>>> ffffa1d40038fe80:  0000000000000000 __flush_smp_call_function_queue+219
>>> ffffa1d40038fe90:  0000000000000286 __cpu_online_mask
>>> ffffa1d40038fea0:  do_softirq+69    0000000000000286
>>> ffffa1d40038feb0:  flush_smp_call_function_queue+96 ffff914f820a8000
>>> ffffa1d40038fec0:  do_idle+405      do_idle+450
>>> ffffa1d40038fed0:  0000000000000010 8b7d6c8869010c00
>>> ffffa1d40038fee0:  0000000000000085 0000000000000000
>>> ffffa1d40038fef0:  0000000000000000 0000000000000000
>>> ffffa1d40038ff00:  0000000000000000 0000000000000000
>>> ffffa1d40038ff10:  cpu_startup_entry+111 8b7d6c8869010c00
>>> ffffa1d40038ff20:  534e004fc6e00600 ffffa1d40038ff38
>>> ffffa1d40038ff30:  start_secondary+396 534e004fc6e00600
>>> ffffa1d40038ff40:  0000000000000000 0000000040000000
>>> ffffa1d40038ff50:  secondary_startup_64_no_verify+209 0000000000000000
>>> ffffa1d40038ff60:  0000000000000000 0000000000000000
>>> ffffa1d40038ff70:  0000000000000000 0000000000000000
>>> ffffa1d40038ff80:  0000000000000000 0000000000000000
>>> ffffa1d40038ff90:  0000000000000000 0000000000000000
>>> ffffa1d40038ffa0:  0000000000000000 0000000000000000
>>> ffffa1d40038ffb0:  0000000000000000 0000000000000000
>>> ffffa1d40038ffc0:  0000000000000000 0000000000000000
>>> ffffa1d40038ffd0:  0000000000000000 0000000000000000
>>> ffffa1d40038ffe0:  0000000000000000 0000000000000000
>>> ffffa1d40038fff0:  0000000000000000 0000000000000000
>>> crash>
>>> Thanks.
>>> Lianbo
>>>
>>>     Thanks,
>>>     Kazu
>>>
>>>     >
>>>     > There is also no exception frame when coming from do_softirq_own_stack.
>>>     >
>>>     > Reported-by: Marco Patalano <mpatalan@xxxxxxxxxx>
>>>     > Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx>
>>>     > ---
>>>     >   x86_64.c | 5 +++--
>>>     >   1 file changed, 3 insertions(+), 2 deletions(-)
>>>     >
>>>     > diff --git a/x86_64.c b/x86_64.c
>>>     > index 5b671bd97775..3428bed417df 100644
>>>     > --- a/x86_64.c
>>>     > +++ b/x86_64.c
>>>     > @@ -3825,10 +3825,11 @@ in_exception_stack:
>>>     >               up -= 1;
>>>     >                   bt->instptr = *up;
>>>     >               /*
>>>     > -              *  No exception frame when coming from call_softirq.
>>>     > +              *  No exception frame when coming from call_softirq
>>>     > +              *  or do_softirq_own_stack.
>>>     >                */
>>>     >               if ((sp = value_search(bt->instptr, &offset)) &&
>>>     > -                 STREQ(sp->name, "call_softirq"))
>>>     > +                 (STREQ(sp->name, "call_softirq") || STREQ(sp->name, "do_softirq_own_stack")))
>>>     >                       irq_eframe = 0;
>>>     >                   bt->frameptr = 0;
>>>     >                   done = FALSE;
>>>
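
For readers following the diff above, the name check it extends can be
sketched in isolation. This is only an illustration, not the code in
x86_64.c: the helper name entered_without_eframe() is hypothetical, and
the STREQ() macro below is a simplified stand-in assumed for this sketch
rather than copied from crash's headers. It shows the idea that both the
old symbol name (call_softirq) and the renamed one (do_softirq_own_stack)
should be treated as entry points that leave no exception frame.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for crash's STREQ() macro (an assumption of this sketch). */
#define STREQ(a, b) (strcmp((a), (b)) == 0)

/*
 * Hypothetical helper: returns 1 when the IRQ stack was entered from a
 * symbol that leaves no exception frame behind, 0 otherwise.
 * "call_softirq" is the name used by older kernels; per the patch
 * description, kernel commit 7d65f4a65532 renamed it to
 * "do_softirq_own_stack".
 */
int entered_without_eframe(const char *sym_name)
{
    if (sym_name == NULL)
        return 0;

    return STREQ(sym_name, "call_softirq") ||
           STREQ(sym_name, "do_softirq_own_stack");
}
```

In the patch itself the equivalent condition clears irq_eframe, so that
bt does not try to decode an exception frame at that point and print the
"possibly bogus exception frame" warning.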
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki



