On 2024/03/26 17:25, lijiang wrote:
> On Tue, Mar 26, 2024 at 2:59 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> wrote:
>
> > On 2024/03/26 15:44, lijiang wrote:
> > > Thanks for the comment, Kazu.
> > >
> > > On Tue, Mar 26, 2024 at 10:28 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> wrote:
> > >
> > > > Hi Lianbo,
> > > >
> > > > thanks for the patch.
> > > >
> > > > What is the kernel version of this vmcore?
> > >
> > > The kernel version is 5.14.0, but I did not reproduce it, it seems it's
> > > not easy to reproduce.
> >
> > I see, thanks.
> >
> > If it's a RHEL kernel, please let me know the release number e.g.
> > 5.14.0-362.8.1.el9_3.x86_64 ?
>
> Not 8.1, it's the 5.14.0-362.2.1.el9_3.x86_64.
>
> > > > and could I have "bt 0 -c 8 | tail -n 30" output?
> > >
> > > crash> bt 0 -c 8 | tail -n 30
> >
> > oh my bad, lack of "bt -r" option...
> > how about "bt 0 -c 8 -r | tail -n 30" ?
>
> crash> bt 0 -c 8 -r | tail -n 30
> ffffbec3c022fe20: 0000000000000000 0000000000000000
> ffffbec3c022fe30: ffff9948c08f6278 pick_next_task+82
> ffffbec3c022fe40: ffffbec3c022fea0 0000000000000000
> ffffbec3c022fe50: 0000000000000000 __switch_to_asm+58
> ffffbec3c022fe60: finish_task_switch+140 0000000000000000
> ffffbec3c022fe70: ffff9948c08f5640 ffff9948e6f03980
> ffffbec3c022fe80: 0000000000000000 tick_nohz_next_event+90
> ffffbec3c022fe90: ffff994c2f2a2ae0 0000000000000000
> ffffbec3c022fea0: 0000000000000000 0000000000000008
> ffffbec3c022feb0: ct_kernel_enter.constprop.0+64 0000000000000046
> ffffbec3c022fec0: read_tsc ktime_get+56
> ffffbec3c022fed0: 0000000000000000 __flush_smp_call_function_queue+206
> ffffbec3c022fee0: 0000000000000286 ffff9948c08f5640
> ffffbec3c022fef0: 0000000000000046 0000000000000286
> ffffbec3c022ff00: flush_smp_call_function_queue+72 0000000000000008
> ffffbec3c022ff10: do_idle+168 0000000040000000
> ffffbec3c022ff20: 0000000000000094 cpu_startup_entry+25
> ffffbec3c022ff30: 0000000000000000 start_secondary+269
> ffffbec3c022ff40: 000000089726a2d0 e48885e126bc1600
> ffffbec3c022ff50: secondary_startup_64_no_verify+229 0000000000000000
> ffffbec3c022ff60: 0000000000000000 0000000000000000
> ffffbec3c022ff70: 0000000000000000 0000000000000000
> ffffbec3c022ff80: 0000000000000000 0000000000000000
> ffffbec3c022ff90: 0000000000000000 0000000000000000
> ffffbec3c022ffa0: 0000000000000000 0000000000000000
> ffffbec3c022ffb0: 0000000000000000 0000000000000000
> ffffbec3c022ffc0: 0000000000000000 0000000000000000
> ffffbec3c022ffd0: 0000000000000000 0000000000000000
> ffffbec3c022ffe0: 0000000000000000 0000000000000000
> ffffbec3c022fff0: 0000000000000000 0000000000000000
> crash>
>
> > Thanks,
> > Kazu
> >
> > > #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
> > >     [exception RIP: __update_load_avg_se+13]
> > >     RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
> > >     RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
> > >     RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
> > >     RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
> > >     R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
> > >     R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
> > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > --- <NMI exception stack> ---
> > > #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
> > > #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
> > > #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
> > > #8 [ffffbec3c08acd60] enqueue_task at ffffffff973481fa
> > > #9 [ffffbec3c08acd88] ttwu_do_activate at ffffffff9734aeed
> > > #10 [ffffbec3c08acdb0] try_to_wake_up at ffffffff9734c7d7
> > > #11 [ffffbec3c08ace08] __queue_work at ffffffff9732a4d2
> > > #12 [ffffbec3c08ace50] queue_work_on at ffffffff9732a6a4
> > > #13 [ffffbec3c08ace60] iomap_dio_bio_end_io at ffffffff976a7b4c
> > > #14 [ffffbec3c08ace90] clone_endio at ffffffffc090315f [dm_mod]
> > > #15 [ffffbec3c08aced0] blk_update_request at ffffffff9779b49d
> > > #16 [ffffbec3c08acf28] scsi_end_request at ffffffff97a3d5a7
> > > #17 [ffffbec3c08acf58] scsi_io_completion at ffffffff97a3e606
> > > #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
> > > #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
> > > #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
> > > --- <IRQ stack> ---
> > > #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
> > > #22 [ffffbec3c022ff38] start_secondary at ffffffff9726a3dd
> > > #23 [ffffbec3c022ff50] secondary_startup_64_no_verify at ffffffff9720015a
> > > crash>
> > >
> > > > If it's RHEL9, probably that do_softirq is called with this path.
> > > >
> > > >   cpu_startup_entry
> > > >     do_idle
> > > >       flush_smp_call_function_queue
> > > >         do_softirq
> > > >
> > > > but do_idle is skipped as below, I'd like to check just in case..
> > >
> > > Good question. I noticed the call trace, but this may be another issue.

Thank you for the bt -r information.

Yes, it looks like they are skipped probably due to x86_64_irq_eframe_link,
but I don't have a good idea for now.

Let's fix this first. I've moved "do_softirq" first to be checked and applied.
https://github.com/crash-utility/crash/commit/ce47cb8dabb56c88e2d753026a9fdc83f83a5f5d

Thanks,
Kazu

> > > Thanks
> > > Lianbo
> > >
> > > > > #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
> > > > > --- <IRQ stack> ---
> > > > > #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
> > > >
> > > > Thanks,
> > > > Kazu
> > > >
> > > > On 2024/03/19 16:59, Lianbo Jiang wrote:
> > > > > The "bogus exception frame" warning was observed again on a specific
> > > > > vmcore, and the remaining frame was truncated on X86_64 machine, when
> > > > > executing the "bt" command as below:
> > > > >
> > > > >   crash> bt 0 -c 8
> > > > >   PID: 0  TASK: ffff9948c08f5640  CPU: 8  COMMAND: "swapper/8"
> > > > >   #0 [fffffe1788788e58] crash_nmi_callback at ffffffff972672bb
> > > > >   #1 [fffffe1788788e68] nmi_handle at ffffffff9722eb8e
> > > > >   #2 [fffffe1788788eb0] default_do_nmi at ffffffff97e51cd0
> > > > >   #3 [fffffe1788788ed0] exc_nmi at ffffffff97e51ee1
> > > > >   #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
> > > > >       [exception RIP: __update_load_avg_se+13]
> > > > >       RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
> > > > >       RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
> > > > >       RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
> > > > >       RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
> > > > >       R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
> > > > >       R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
> > > > >       ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > >   --- <NMI exception stack> ---
> > > > >   #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
> > > > >   #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
> > > > >   #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
> > > > >   #8 [ffffbec3c08acd60] enqueue_task at ffffffff973481fa
> > > > >   #9 [ffffbec3c08acd88] ttwu_do_activate at ffffffff9734aeed
> > > > >   #10 [ffffbec3c08acdb0] try_to_wake_up at ffffffff9734c7d7
> > > > >   #11 [ffffbec3c08ace08] __queue_work at ffffffff9732a4d2
> > > > >   #12 [ffffbec3c08ace50] queue_work_on at ffffffff9732a6a4
> > > > >   #13 [ffffbec3c08ace60] iomap_dio_bio_end_io at ffffffff976a7b4c
> > > > >   #14 [ffffbec3c08ace90] clone_endio at ffffffffc090315f [dm_mod]
> > > > >   #15 [ffffbec3c08aced0] blk_update_request at ffffffff9779b49d
> > > > >   #16 [ffffbec3c08acf28] scsi_end_request at ffffffff97a3d5a7
> > > > >   #17 [ffffbec3c08acf58] scsi_io_completion at ffffffff97a3e606
> > > > >   #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
> > > > >   #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
> > > > >   #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
> > > > >   --- <IRQ stack> ---
> > > > >   #21 [ffffbec3c022ff18] do_idle at ffffffff97368288
> > > > >       [exception RIP: unknown or invalid address]
> > > > >       RIP: 0000000000000000  RSP: 0000000000000000  RFLAGS: 00000000
> > > > >       RAX: 0000000000000000  RBX: 000000089726a2d0  RCX: 0000000000000000
> > > > >       RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
> > > > >       RBP: ffffffff9726a3dd   R8: 0000000000000000   R9: 0000000000000000
> > > > >       R10: ffffffff9720015a  R11: e48885e126bc1600  R12: 0000000000000000
> > > > >       R13: ffffffff973684a9  R14: 0000000000000094  R15: 0000000040000000
> > > > >       ORIG_RAX: 0000000000000000  CS: 0000  SS: 0000
> > > > >   bt: WARNING: possibly bogus exception frame
> > > > >   crash>
> > > > >
> > > > > Actually there is no exception frame, when called from do_softirq().
> > > > > With the patch:
> > > > >
> > > > >   crash> bt 0 -c 8
> > > > >   PID: 0  TASK: ffff9948c08f5640  CPU: 8  COMMAND: "swapper/8"
> > > > >   #0 [fffffe1788788e58] crash_nmi_callback at ffffffff972672bb
> > > > >   #1 [fffffe1788788e68] nmi_handle at ffffffff9722eb8e
> > > > >   #2 [fffffe1788788eb0] default_do_nmi at ffffffff97e51cd0
> > > > >   #3 [fffffe1788788ed0] exc_nmi at ffffffff97e51ee1
> > > > >   #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
> > > > >       [exception RIP: __update_load_avg_se+13]
> > > > >       RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
> > > > >       RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
> > > > >       RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
> > > > >       RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
> > > > >       R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
> > > > >       R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
> > > > >       ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > >   --- <NMI exception stack> ---
> > > > >   #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
> > > > >   #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
> > > > >   #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
> > > > >   #8 [ffffbec3c08acd60] enqueue_task at ffffffff973481fa
> > > > >   #9 [ffffbec3c08acd88] ttwu_do_activate at ffffffff9734aeed
> > > > >   #10 [ffffbec3c08acdb0] try_to_wake_up at ffffffff9734c7d7
> > > > >   #11 [ffffbec3c08ace08] __queue_work at ffffffff9732a4d2
> > > > >   #12 [ffffbec3c08ace50] queue_work_on at ffffffff9732a6a4
> > > > >   #13 [ffffbec3c08ace60] iomap_dio_bio_end_io at ffffffff976a7b4c
> > > > >   #14 [ffffbec3c08ace90] clone_endio at ffffffffc090315f [dm_mod]
> > > > >   #15 [ffffbec3c08aced0] blk_update_request at ffffffff9779b49d
> > > > >   #16 [ffffbec3c08acf28] scsi_end_request at ffffffff97a3d5a7
> > > > >   #17 [ffffbec3c08acf58] scsi_io_completion at ffffffff97a3e606
> > > > >   #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
> > > > >   #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
> > > > >   #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
> > > > >   --- <IRQ stack> ---
> > > > >   #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
> > > > >   #22 [ffffbec3c022ff38] start_secondary at ffffffff9726a3dd
> > > > >   #23 [ffffbec3c022ff50] secondary_startup_64_no_verify at ffffffff9720015a
> > > > >   crash>
> > > > >
> > > > > Reported-by: Jie Li <jieli@xxxxxxxxxx>
> > > > > Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx>
> > > > > ---
> > > > >  x86_64.c | 7 ++++---
> > > > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/x86_64.c b/x86_64.c
> > > > > index 502817d3b2bd..c672a0c3e8fc 100644
> > > > > --- a/x86_64.c
> > > > > +++ b/x86_64.c
> > > > > @@ -3841,11 +3841,12 @@ in_exception_stack:
> > > > >                  up -= 1;
> > > > >                  bt->instptr = *up;
> > > > >                  /*
> > > > > -                 *  No exception frame when coming from do_softirq_own_stack
> > > > > -                 *  or call_softirq.
> > > > > +                 *  No exception frame when coming from do_softirq_own_stack,
> > > > > +                 *  call_softirq or do_softirq.
> > > > >                   */
> > > > >                  if ((sp = value_search(bt->instptr, &offset)) &&
> > > > > -                    (STREQ(sp->name, "do_softirq_own_stack") || STREQ(sp->name, "call_softirq")))
> > > > > +                    (STREQ(sp->name, "do_softirq_own_stack") || STREQ(sp->name, "call_softirq")
> > > > > +                    || STREQ(sp->name, "do_softirq")))
> > > > >                          irq_eframe = 0;
> > > > >                  bt->frameptr = 0;
> > > > >                  done = FALSE;
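
For readers following the thread, the check that the patch extends can be sketched in isolation as below. This is only an illustration and not the code in x86_64.c: the function name entered_irq_stack_without_eframe and the table softirq_entry_points are hypothetical, and plain strcmp() stands in for the STREQ() macro used in the patch. The three symbol names are the ones listed in the patch, with "do_softirq" placed first as mentioned in the reply above. The comparison is an exact match, so "__do_softirq" (frame #19 in the backtrace) does not qualify, while "do_softirq" (frame #20) does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical table for illustration. The names are the three callers
 * that the patch treats as entering the IRQ stack without an exception
 * frame; "do_softirq" is listed first, as in the applied version.
 */
static const char *const softirq_entry_points[] = {
    "do_softirq",
    "do_softirq_own_stack",
    "call_softirq",
};

/*
 * Return true when the symbol that the saved return address resolves to
 * is one of the softirq entry points, meaning that no exception frame
 * should be expected at the switch from the IRQ stack to the task stack.
 * A NULL name (symbol lookup failed) is treated as "not a softirq entry".
 */
bool entered_irq_stack_without_eframe(const char *symbol_name)
{
    size_t count = sizeof(softirq_entry_points) / sizeof(softirq_entry_points[0]);
    size_t i;

    if (symbol_name == NULL)
        return false;

    for (i = 0; i < count; i++) {
        /* Exact comparison, standing in for crash's STREQ(). */
        if (strcmp(symbol_name, softirq_entry_points[i]) == 0)
            return true;
    }
    return false;
}
```

In the patch itself the equivalent condition clears irq_eframe, which is what keeps "bt" from printing the bogus exception frame after the "--- <IRQ stack> ---" line.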