On 2024/03/26 15:44, lijiang wrote:
> Thanks for the comment, Kazu.
>
> On Tue, Mar 26, 2024 at 10:28 AM HAGIO KAZUHITO(萩尾 一仁)
> <k-hagio-ab@xxxxxxx> wrote:
>
>     Hi Lianbo,
>
>     thanks for the patch.
>
>     What is the kernel version of this vmcore?
>
>
> The kernel version is 5.14.0, but I did not reproduce it, it seems it's
> not easy to reproduce.

I see, thanks. If it's a RHEL kernel, please let me know the release
number e.g. 5.14.0-362.8.1.el9_3.x86_64 ?

>
>     and could I have "bt 0 -c 8 | tail -n 30" output?
>
>
> crash> bt 0 -c 8 | tail -n 30

oh my bad, lack of "bt -r" option... how about "bt 0 -c 8 -r | tail -n 30" ?

Thanks,
Kazu

>   #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
>      [exception RIP: __update_load_avg_se+13]
>      RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
>      RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
>      RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
>      RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
>      R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
>      R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
>      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  --- <NMI exception stack> ---
>   #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
>   #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
>   #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
>   #8 [ffffbec3c08acd60] enqueue_task at ffffffff973481fa
>   #9 [ffffbec3c08acd88] ttwu_do_activate at ffffffff9734aeed
>  #10 [ffffbec3c08acdb0] try_to_wake_up at ffffffff9734c7d7
>  #11 [ffffbec3c08ace08] __queue_work at ffffffff9732a4d2
>  #12 [ffffbec3c08ace50] queue_work_on at ffffffff9732a6a4
>  #13 [ffffbec3c08ace60] iomap_dio_bio_end_io at ffffffff976a7b4c
>  #14 [ffffbec3c08ace90] clone_endio at ffffffffc090315f [dm_mod]
>  #15 [ffffbec3c08aced0] blk_update_request at ffffffff9779b49d
>  #16 [ffffbec3c08acf28] scsi_end_request at ffffffff97a3d5a7
>  #17 [ffffbec3c08acf58] scsi_io_completion at ffffffff97a3e606
>  #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
>  #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
>  #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
>  --- <IRQ stack> ---
>  #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
>  #22 [ffffbec3c022ff38] start_secondary at ffffffff9726a3dd
>  #23 [ffffbec3c022ff50] secondary_startup_64_no_verify at ffffffff9720015a
> crash>
>
>     If it's RHEL9, probably that do_softirq is called with this path.
>
>     cpu_startup_entry
>       do_idle
>         flush_smp_call_function_queue
>           do_softirq
>
>     but do_idle is skipped as below, I'd like to check just in case..
>
> Good question. I noticed the call trace, but this may be another issue.
> Thanks
> Lianbo
>
>     > #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
>     > --- <IRQ stack> ---
>     > #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
>
>     Thanks,
>     Kazu
>
>     On 2024/03/19 16:59, Lianbo Jiang wrote:
>     > The "bogus exception frame" warning was observed again on a specific
>     > vmcore, and the remaining frame was truncated on X86_64 machine, when
>     > executing the "bt" command as below:
>     >
>     >    crash> bt 0 -c 8
>     >    PID: 0      TASK: ffff9948c08f5640  CPU: 8    COMMAND: "swapper/8"
>     >     #0 [fffffe1788788e58] crash_nmi_callback at ffffffff972672bb
>     >     #1 [fffffe1788788e68] nmi_handle at ffffffff9722eb8e
>     >     #2 [fffffe1788788eb0] default_do_nmi at ffffffff97e51cd0
>     >     #3 [fffffe1788788ed0] exc_nmi at ffffffff97e51ee1
>     >     #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
>     >        [exception RIP: __update_load_avg_se+13]
>     >        RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
>     >        RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
>     >        RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
>     >        RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
>     >        R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
>     >        R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
>     >        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>     >    --- <NMI exception stack> ---
>     >     #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
>     >     #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
>     >     #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
>     >     #8 [ffffbec3c08acd60] enqueue_task at ffffffff973481fa
>     >     #9 [ffffbec3c08acd88] ttwu_do_activate at ffffffff9734aeed
>     >    #10 [ffffbec3c08acdb0] try_to_wake_up at ffffffff9734c7d7
>     >    #11 [ffffbec3c08ace08] __queue_work at ffffffff9732a4d2
>     >    #12 [ffffbec3c08ace50] queue_work_on at ffffffff9732a6a4
>     >    #13 [ffffbec3c08ace60] iomap_dio_bio_end_io at ffffffff976a7b4c
>     >    #14 [ffffbec3c08ace90] clone_endio at ffffffffc090315f [dm_mod]
>     >    #15 [ffffbec3c08aced0] blk_update_request at ffffffff9779b49d
>     >    #16 [ffffbec3c08acf28] scsi_end_request at ffffffff97a3d5a7
>     >    #17 [ffffbec3c08acf58] scsi_io_completion at ffffffff97a3e606
>     >    #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
>     >    #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
>     >    #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
>     >    --- <IRQ stack> ---
>     >    #21 [ffffbec3c022ff18] do_idle at ffffffff97368288
>     >        [exception RIP: unknown or invalid address]
>     >        RIP: 0000000000000000  RSP: 0000000000000000  RFLAGS: 00000000
>     >        RAX: 0000000000000000  RBX: 000000089726a2d0  RCX: 0000000000000000
>     >        RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
>     >        RBP: ffffffff9726a3dd   R8: 0000000000000000   R9: 0000000000000000
>     >        R10: ffffffff9720015a  R11: e48885e126bc1600  R12: 0000000000000000
>     >        R13: ffffffff973684a9  R14: 0000000000000094  R15: 0000000040000000
>     >        ORIG_RAX: 0000000000000000  CS: 0000  SS: 0000
>     >    bt: WARNING: possibly bogus exception frame
>     >    crash>
>     >
>     > Actually there is no exception frame, when called from do_softirq().
>     >
>     > With the patch:
>     >
>     >    crash> bt 0 -c 8
>     >    PID: 0      TASK: ffff9948c08f5640  CPU: 8    COMMAND: "swapper/8"
>     >     #0 [fffffe1788788e58] crash_nmi_callback at ffffffff972672bb
>     >     #1 [fffffe1788788e68] nmi_handle at ffffffff9722eb8e
>     >     #2 [fffffe1788788eb0] default_do_nmi at ffffffff97e51cd0
>     >     #3 [fffffe1788788ed0] exc_nmi at ffffffff97e51ee1
>     >     #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
>     >        [exception RIP: __update_load_avg_se+13]
>     >        RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
>     >        RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
>     >        RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
>     >        RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
>     >        R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
>     >        R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
>     >        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>     >    --- <NMI exception stack> ---
>     >     #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
>     >     #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
>     >     #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
>     >     #8 [ffffbec3c08acd60] enqueue_task at ffffffff973481fa
>     >     #9 [ffffbec3c08acd88] ttwu_do_activate at ffffffff9734aeed
>     >    #10 [ffffbec3c08acdb0] try_to_wake_up at ffffffff9734c7d7
>     >    #11 [ffffbec3c08ace08] __queue_work at ffffffff9732a4d2
>     >    #12 [ffffbec3c08ace50] queue_work_on at ffffffff9732a6a4
>     >    #13 [ffffbec3c08ace60] iomap_dio_bio_end_io at ffffffff976a7b4c
>     >    #14 [ffffbec3c08ace90] clone_endio at ffffffffc090315f [dm_mod]
>     >    #15 [ffffbec3c08aced0] blk_update_request at ffffffff9779b49d
>     >    #16 [ffffbec3c08acf28] scsi_end_request at ffffffff97a3d5a7
>     >    #17 [ffffbec3c08acf58] scsi_io_completion at ffffffff97a3e606
>     >    #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
>     >    #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
>     >    #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
>     >    --- <IRQ stack> ---
>     >    #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
>     >    #22 [ffffbec3c022ff38] start_secondary at ffffffff9726a3dd
>     >    #23 [ffffbec3c022ff50] secondary_startup_64_no_verify at ffffffff9720015a
>     >    crash>
>     >
>     > Reported-by: Jie Li <jieli@xxxxxxxxxx>
>     > Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx>
>     > ---
>     >   x86_64.c | 7 ++++---
>     >   1 file changed, 4 insertions(+), 3 deletions(-)
>     >
>     > diff --git a/x86_64.c b/x86_64.c
>     > index 502817d3b2bd..c672a0c3e8fc 100644
>     > --- a/x86_64.c
>     > +++ b/x86_64.c
>     > @@ -3841,11 +3841,12 @@ in_exception_stack:
>     >                   up -= 1;
>     >                   bt->instptr = *up;
>     >                   /*
>     > -                 * No exception frame when coming from do_softirq_own_stack
>     > -                 * or call_softirq.
>     > +                 * No exception frame when coming from do_softirq_own_stack,
>     > +                 * call_softirq or do_softirq.
>     >                    */
>     >                   if ((sp = value_search(bt->instptr, &offset)) &&
>     > -                    (STREQ(sp->name, "do_softirq_own_stack") || STREQ(sp->name, "call_softirq")))
>     > +                    (STREQ(sp->name, "do_softirq_own_stack") || STREQ(sp->name, "call_softirq")
>     > +                    || STREQ(sp->name, "do_softirq")))
>     >                           irq_eframe = 0;
>     >                   bt->frameptr = 0;
>     >                   done = FALSE;
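
For readers following the patch: the changed condition is essentially an exact-match membership test on the symbol name that the saved return address resolves to. Below is a minimal standalone sketch of that test, not the code in x86_64.c. The helper name `returns_without_eframe` and the string table are hypothetical, introduced only for illustration; the patch itself keeps the inline STREQ() chain after value_search().

```c
#include <stddef.h>
#include <string.h>

/*
 * Functions that switch to the IRQ stack without leaving an
 * exception frame behind, per the comment in the patch.
 * The table layout is an assumption made for this sketch.
 */
const char *const no_eframe_callers[] = {
    "do_softirq_own_stack",
    "call_softirq",
    "do_softirq",            /* the name this patch adds */
};

/*
 * Hypothetical helper: return 1 if `name` exactly matches one of
 * the callers above, 0 otherwise (including for a NULL name).
 */
int returns_without_eframe(const char *name)
{
    size_t i;
    size_t n = sizeof(no_eframe_callers) / sizeof(no_eframe_callers[0]);

    if (name == NULL)
        return 0;

    for (i = 0; i < n; i++) {
        /* Exact comparison, so "__do_softirq" does not match "do_softirq". */
        if (strcmp(name, no_eframe_callers[i]) == 0)
            return 1;
    }
    return 0;
}
```

Under these assumptions the condition in the hunk would read as `(sp = value_search(bt->instptr, &offset)) && returns_without_eframe(sp->name)`. Whether a table is preferable to the inline chain is a style question for the maintainers; the behavior is intended to be the same.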