Hi Lianbo,
Please help review the new patch, v7. The only change is the removal of the redundant code.
Since the patch is a file, what is the better way to include it in an email? Copy and paste would not be practical for a large patch file.
I have no vmcore file, but there is a kernel module that can trigger an overflow stack panic for testing. Please download the module from link [1], compile it, and load it on your test box. Please read the README.txt and the source
code for more details.
Best regards
Hong
From: Hong YANG3 杨红 <hong.yang3@xxxxxxx>
Sent: Monday, November 29, 2021 11:40
To: lijiang <lijiang@xxxxxxxxxx>; Discussion list for crash utility usage, maintenance and development <crash-utility@xxxxxxxxxx>
Subject: Re: arm64: Support overflow stack panic
Hi Lianbo,
I'm using Outlook to send mail to this list. I'll try to find a better way to send patches and mails that is friendlier for all readers.
I'll send out a demo kernel module that can trigger an overflow panic for testing, and I'll also update the patch per your comments in the previous mail.
Thanks for your quick reply.
Best regards
Hong
From: lijiang <lijiang@xxxxxxxxxx>
Sent: Monday, November 29, 2021 10:58
To: Hong YANG3 杨红 <hong.yang3@xxxxxxx>; Discussion list for crash utility usage, maintenance and development <crash-utility@xxxxxxxxxx>
Subject: Re: arm64: Support overflow stack panic
Hi, Hong
Thank you for the patch. I added the comments below, other changes look good to me.
@@ -1978,7 +2028,10 @@ arm64_in_exception_text(ulong ptr)
 		if ((ptr >= ms->__exception_text_start) &&
 		    (ptr < ms->__exception_text_end))
 			return TRUE;
-	} else if ((name = closest_symbol(ptr))) { /* Linux 5.5 and later */
+	}
+
+	name = closest_symbol(ptr);
+	if (name != NULL) { /* Linux 5.5 and later */

The above changes are unrelated to your patch itself. But anyway, this looks more readable to me.

 		for (func = &arm64_exception_functions[0]; *func; func++) {
 			if (STREQ(name, *func))
 				return TRUE;
@@ -2255,12 +2308,14 @@ arm64_unwind_frame(struct bt_info *bt, struct arm64_stackframe *frame)
 	if (!(machdep->flags & IRQ_STACKS))
 		return TRUE;
 
-	if (!(machdep->flags & IRQ_STACKS))
+	if (!(machdep->flags & OVERFLOW_STACKS))
 		return TRUE;

Originally, there were two identical (repeated) statements, one of which must be redundant. This time, can it be changed to a single statement as below?

	if (!(machdep->flags & (IRQ_STACKS | OVERFLOW_STACKS)))
		return TRUE;
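
One point that may be worth checking here: the combined mask is not logically identical to two consecutive checks. Two separate checks return early when either flag is missing, while the combined mask returns early only when both are missing. The sketch below illustrates the difference. The flag values and both helper names (DEMO_IRQ_STACKS, DEMO_OVERFLOW_STACKS, early_return_sequential, early_return_combined) are made up for illustration and are not crash identifiers.

```c
#include <stdbool.h>

/* Hypothetical flag values for the demo only; not the values used in defs.h. */
#define DEMO_IRQ_STACKS       (0x1UL)
#define DEMO_OVERFLOW_STACKS  (0x2UL)

/* Two consecutive checks: returns early if EITHER flag is missing. */
static bool early_return_sequential(unsigned long flags)
{
	if (!(flags & DEMO_IRQ_STACKS))
		return true;
	if (!(flags & DEMO_OVERFLOW_STACKS))
		return true;
	return false;
}

/* Combined mask: returns early only if BOTH flags are missing. */
static bool early_return_combined(unsigned long flags)
{
	if (!(flags & (DEMO_IRQ_STACKS | DEMO_OVERFLOW_STACKS)))
		return true;
	return false;
}
```

For a kernel that has IRQ stacks but no overflow stack, the sequential form returns early, whereas the combined form continues into the stack-switch handling.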
BTW: this patch was sent as an attachment, which is inconvenient for other reviewers to add comments.
In addition, I have a request: can you share the vmcore with me if it doesn't contain confidential data? I'm collecting specific vmcores
for testing, and so far I haven't been able to reproduce this issue.
Thanks.
Lianbo
From 5c04bffd220240d5e2fa09d522062f8798eb42a9 Mon Sep 17 00:00:00 2001
From: Hong YANG <hong.yang3@xxxxxxx>
Date: Mon, 15 Nov 2021 15:41:01 +0800
Subject: [PATCH] arm64: Support overflow stack panic

Overflow stack exception handling has been supported since kernel 4.14
(commit 872d8327ce8). This patch tries to load the overflow_stack
information on startup and dump the back trace from the overflow stack.

Before:

      KERNEL: vmlinux
    DUMPFILE: core.file
        CPUS: 8
        DATE: Mon Nov 29 15:49:26 CST 2021
      UPTIME: 00:02:51
LOAD AVERAGE: 1.02, 0.88, 0.37
       TASKS: 1857
    NODENAME: localhost
     RELEASE: 4.14.156+
     VERSION: #1 SMP PREEMPT Thu Nov 25 13:07:21 UTC 2021
     MACHINE: aarch64 (unknown Mhz)
      MEMORY: 8 GB
       PANIC: "Kernel panic - not syncing: kernel stack overflow"
         PID: 3607
     COMMAND: "sh"
        TASK: ffffffcbf9a4da00 [THREAD_INFO: ffffffcbf9a4da00]
         CPU: 2
       STATE: TASK_RUNNING (PANIC)

crash-7.3.0.orig> bt
PID: 3607 TASK: ffffffcbf9a4da00 CPU: 2 COMMAND: "sh"
Segmentation fault (core dumped)

After:

crash> bt
PID: 3607 TASK: ffffffcbf9a4da00 CPU: 2 COMMAND: "sh"
 #0 [ffffffccbfd85f50] __delay at ffffff8008ceded8
...
 #5 [ffffffccbfd85fd0] emergency_restart at ffffff80080d49fc
 #6 [ffffffccbfd86140] panic at ffffff80080af4c0
 #7 [ffffffccbfd86150] nmi_panic at ffffff80080af150
 #8 [ffffffccbfd86190] handle_bad_stack at ffffff800808b0b8
 #9 [ffffffccbfd862d0] __bad_stack at ffffff800808285c
     PC: ffffff8008082e80 [el1_sync]
     LR: ffffff8000d6c214 [stack_overflow_demo+84]
     SP: ffffff1a79930070 PSTATE: 204003c5
    X29: ffffff8011b03d00 X28: ffffffcbf9a4da00 X27: ffffff8008e02000
    X26: 0000000000000040 X25: 0000000000000124 X24: ffffffcbf9a4da00
    X23: 0000007daec2e288 X22: ffffffcbfe03b800 X21: 0000007daec2e288
    X20: 0000000000000002 X19: 0000000000000002 X18: 0000000000000002
    X17: 00000000000003e7 X16: 0000000000000000 X15: 0000000000000000
    X14: ffffffcc17facb00 X13: ffffffccb4c25c00 X12: 0000000000000000
    X11: ffffffcc17fad660 X10: 0000000000000af0  X9: 0000000000000000
     X8: ffffff1a799334f0  X7: 0000000000000000  X6: 000000000000003f
     X5: 0000000000000040  X4: 0000000000000010  X3: 00000065981d07f0
     X2: 00000065981d07f0  X1: 0000000000000000  X0: ffffff1a799334f0
--- <Overflow stack> ---
#10 [ffffff8011b03d00] el1_error_invalid at ffffff8008082e7c
#11 [ffffff8011b03d60] write_enable at ffffff8000d6c134 [pso]
#12 [ffffff8011b03da0] full_proxy_write at ffffff800839b2fc
#13 [ffffff8011b03e30] __vfs_write at ffffff800823f4d8
#14 [ffffff8011b03e70] vfs_write at ffffff800823f874
#15 [ffffff8011b03eb0] sys_write at ffffff800823fa68
#16 [ffffff8011b03ff0] el0_svc_naked at ffffff800808387c
     PC: 0000007daf070244  LR: 000000648253a090  SP: 0000007ff2a00fa0
    X29: 0000007ff2a01050 X28: 000000648257e000 X27: 0000007ff2a00fb0
    X26: 000000648257fdb9 X25: 0000000000000000 X24: 0000007ff2a00fc8
    X23: 0000007ff2a00fd0 X22: 000000648253d270 X21: 000000648257e080
    X20: 0000007daec2e288 X19: 0000000000000002 X18: 0000000000000008
    X17: 0000007daf07023c X16: 000000648257de48 X15: aaaaaaaaaaaaaaab
    X14: 0000000000000800 X13: 0000007ff2a01040 X12: 0000007daec0d848
    X11: 0000000000000003 X10: 0000007daec2e289  X9: 0000007daec2e288
     X8: 0000000000000040  X7: 0000000000000000  X6: 0000000000000031
     X5: 0000007daec2c32a  X4: 0000007daec34768  X3: 0000007daec2e1e8
     X2: 0000000000000002  X1: 0000007daec2e288  X0: 0000000000000001
    ORIG_X0: 0000000000000001 SYSCALLNO: 40 PSTATE: 80001000

Signed-off-by: Hong YANG <hong.yang3@xxxxxxx>
---
 arm64.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 defs.h  |   6 ++
 2 files changed, 159 insertions(+), 16 deletions(-)

diff --git a/arm64.c b/arm64.c
index 94681d1..23c3d75 100644
--- a/arm64.c
+++ b/arm64.c
@@ -45,6 +45,7 @@ static int arm64_vtop_3level_4k(ulong, ulong, physaddr_t *, int);
 static int arm64_vtop_4level_4k(ulong, ulong, physaddr_t *, int);
 static ulong arm64_get_task_pgd(ulong);
 static void arm64_irq_stack_init(void);
+static void arm64_overflow_stack_init(void);
 static void arm64_stackframe_init(void);
 static int arm64_eframe_search(struct bt_info *);
 static int arm64_is_kernel_exception_frame(struct bt_info *, ulong);
@@ -63,6 +64,7 @@ static int arm64_get_dumpfile_stackframe(struct bt_info *, struct arm64_stackfra
 static int arm64_in_kdump_text(struct bt_info *, struct arm64_stackframe *);
 static int arm64_in_kdump_text_on_irq_stack(struct bt_info *);
 static int arm64_switch_stack(struct bt_info *, struct arm64_stackframe *, FILE *);
+static int arm64_switch_stack_from_overflow(struct bt_info *, struct arm64_stackframe *, FILE *);
 static int arm64_get_stackframe(struct bt_info *, struct arm64_stackframe *);
 static void arm64_get_stack_frame(struct bt_info *, ulong *, ulong *);
 static void arm64_gen_hidden_frame(struct bt_info *bt, ulong, struct arm64_stackframe *);
@@ -78,8 +80,11 @@ static int arm64_get_smp_cpus(void);
 static void arm64_clear_machdep_cache(void);
 static int arm64_on_process_stack(struct bt_info *, ulong);
 static int arm64_in_alternate_stack(int, ulong);
+static int arm64_in_alternate_stackv(int cpu, ulong stkptr, ulong *stacks, ulong stack_size);
 static int arm64_on_irq_stack(int, ulong);
+static int arm64_on_overflow_stack(int, ulong);
 static void arm64_set_irq_stack(struct bt_info *);
+static void arm64_set_overflow_stack(struct bt_info *);
 static void arm64_set_process_stack(struct bt_info *);
 static int arm64_get_kvaddr_ranges(struct vaddr_range *);
 static void arm64_get_crash_notes(void);
@@ -463,6 +468,7 @@ arm64_init(int when)
 		machdep->hz = 100;
 
 		arm64_irq_stack_init();
+		arm64_overflow_stack_init();
 		arm64_stackframe_init();
 		break;
 
@@ -1715,6 +1721,49 @@ arm64_irq_stack_init(void)
 	}
 }
 
+/*
+ * Gather Overflow stack values.
+ *
+ * Overflow stack supported since 4.14, in commit 872d8327c
+ */
+static void
+arm64_overflow_stack_init(void)
+{
+	int i;
+	struct syment *sp;
+	struct gnu_request request, *req;
+	struct machine_specific *ms = machdep->machspec;
+	req = &request;
+
+	if (symbol_exists("overflow_stack") &&
+	    (sp = per_cpu_symbol_search("overflow_stack")) &&
+	    get_symbol_type("overflow_stack", NULL, req)) {
+		if (CRASHDEBUG(1)) {
+			fprintf(fp, "overflow_stack: \n");
+			fprintf(fp, " type: %x, %s\n",
+				(int)req->typecode,
+				(req->typecode == TYPE_CODE_ARRAY) ?
+				"TYPE_CODE_ARRAY" : "other");
+			fprintf(fp, " target_typecode: %x, %s\n",
+				(int)req->target_typecode,
+				req->target_typecode == TYPE_CODE_INT ?
+				"TYPE_CODE_INT" : "other");
+			fprintf(fp, " target_length: %ld\n",
+				req->target_length);
+			fprintf(fp, " length: %ld\n", req->length);
+		}
+
+		if (!(ms->overflow_stacks = (ulong *)malloc((size_t)(kt->cpus * sizeof(ulong)))))
+			error(FATAL, "cannot malloc overflow_stack addresses\n");
+
+		ms->overflow_stack_size = ARM64_OVERFLOW_STACK_SIZE;
+		machdep->flags |= OVERFLOW_STACKS;
+
+		for (i = 0; i < kt->cpus; i++)
+			ms->overflow_stacks[i] = kt->__per_cpu_offset[i] + sp->value;
+	}
+}
+
 /*
  * Gather and verify all of the backtrace requirements.
  */
@@ -1960,6 +2009,7 @@ static char *arm64_exception_functions[] = {
 	"do_mem_abort",
 	"do_el0_irq_bp_hardening",
 	"do_sp_pc_abort",
+	"handle_bad_stack",
 	NULL
 };
 
@@ -1978,7 +2028,10 @@ arm64_in_exception_text(ulong ptr)
 		if ((ptr >= ms->__exception_text_start) &&
 		    (ptr < ms->__exception_text_end))
 			return TRUE;
-	} else if ((name = closest_symbol(ptr))) { /* Linux 5.5 and later */
+	}
+
+	name = closest_symbol(ptr);
+	if (name != NULL) { /* Linux 5.5 and later */
 		for (func = &arm64_exception_functions[0]; *func; func++) {
 			if (STREQ(name, *func))
 				return TRUE;
@@ -2252,15 +2305,14 @@ arm64_unwind_frame(struct bt_info *bt, struct arm64_stackframe *frame)
 	if ((frame->fp == 0) && (frame->pc == 0))
 		return FALSE;
 
-	if (!(machdep->flags & IRQ_STACKS))
-		return TRUE;
-
-	if (!(machdep->flags & IRQ_STACKS))
+	if (!(machdep->flags & (IRQ_STACKS | OVERFLOW_STACKS)))
 		return TRUE;
 
 	if (machdep->flags & UNW_4_14) {
-		if ((bt->flags & BT_IRQSTACK) &&
-		    !arm64_on_irq_stack(bt->tc->processor, frame->fp)) {
+		if (((bt->flags & BT_IRQSTACK) &&
+		    !arm64_on_irq_stack(bt->tc->processor, frame->fp)) ||
+		    ((bt->flags & BT_OVERFLOW_STACK) &&
+		    !arm64_on_overflow_stack(bt->tc->processor, frame->fp))) {
 			if (arm64_on_process_stack(bt, frame->fp)) {
 				arm64_set_process_stack(bt);
 
@@ -2677,6 +2729,9 @@ arm64_back_trace_cmd(struct bt_info *bt)
 	if (arm64_on_irq_stack(bt->tc->processor, bt->frameptr)) {
 		arm64_set_irq_stack(bt);
 		bt->flags |= BT_IRQSTACK;
+	} else if (arm64_on_overflow_stack(bt->tc->processor, bt->frameptr)) {
+		arm64_set_overflow_stack(bt);
+		bt->flags |= BT_OVERFLOW_STACK;
 	}
 	stackframe.sp = bt->stkptr;
 	stackframe.pc = bt->instptr;
@@ -2731,7 +2786,9 @@ arm64_back_trace_cmd(struct bt_info *bt)
 			break;
 
 		if (arm64_in_exception_text(bt->instptr) && INSTACK(stackframe.fp, bt)) {
-			if (!(bt->flags & BT_IRQSTACK) ||
+			if (bt->flags & BT_OVERFLOW_STACK) {
+				exception_frame = stackframe.fp - KERN_EFRAME_OFFSET;
+			} else if (!(bt->flags & BT_IRQSTACK) ||
 			    ((stackframe.sp + SIZE(pt_regs)) < bt->stacktop)) {
 				if (arm64_is_kernel_exception_frame(bt, stackframe.fp - KERN_EFRAME_OFFSET))
 					exception_frame = stackframe.fp - KERN_EFRAME_OFFSET;
@@ -2745,6 +2802,12 @@ arm64_back_trace_cmd(struct bt_info *bt)
 				break;
 		}
 
+		if ((bt->flags & BT_OVERFLOW_STACK) &&
+		    !arm64_on_overflow_stack(bt->tc->processor, stackframe.fp)) {
+			bt->flags &= ~BT_OVERFLOW_STACK;
+			if (arm64_switch_stack_from_overflow(bt, &stackframe, ofp) == USER_MODE)
+				break;
+		}
 		level++;
 	}
 
@@ -3131,6 +3194,43 @@ arm64_switch_stack(struct bt_info *bt, struct arm64_stackframe *frame, FILE *ofp
 	return KERNEL_MODE;
 }
 
+static int
+arm64_switch_stack_from_overflow(struct bt_info *bt, struct arm64_stackframe *frame, FILE *ofp)
+{
+	int i;
+	ulong stacktop, words, addr;
+	ulong *stackbuf;
+	char buf[BUFSIZE];
+	struct machine_specific *ms = machdep->machspec;
+
+	if (bt->flags & BT_FULL) {
+		stacktop = ms->overflow_stacks[bt->tc->processor] + ms->overflow_stack_size;
+		words = (stacktop - bt->bptr) / sizeof(ulong);
+		stackbuf = (ulong *)GETBUF(words * sizeof(ulong));
+		readmem(bt->bptr, KVADDR, stackbuf, words * sizeof(long),
+			"top of overflow stack", FAULT_ON_ERROR);
+
+		addr = bt->bptr;
+		for (i = 0; i < words; i++) {
+			if (!(i & 1))
+				fprintf(ofp, "%s %lx: ", i ? "\n" : "", addr);
+			fprintf(ofp, "%s ", format_stack_entry(bt, buf, stackbuf[i], 0));
+			addr += sizeof(ulong);
+		}
+		fprintf(ofp, "\n");
+		FREEBUF(stackbuf);
+	}
+	fprintf(ofp, "--- <Overflow stack> ---\n");
+
+	if (frame->fp == 0)
+		return USER_MODE;
+
+	if (!(machdep->flags & UNW_4_14))
+		arm64_print_exception_frame(bt, frame->sp, KERNEL_MODE, ofp);
+
+	return KERNEL_MODE;
+}
+
 static int
 arm64_get_dumpfile_stackframe(struct bt_info *bt, struct arm64_stackframe *frame)
 {
@@ -3682,6 +3782,16 @@ arm64_display_machine_stats(void)
 				machdep->machspec->irq_stacks[i]);
 		}
 	}
+	if (machdep->machspec->overflow_stack_size) {
+		fprintf(fp, "OVERFLOW STACK SIZE: %ld\n",
+			machdep->machspec->overflow_stack_size);
+		fprintf(fp, " OVERFLOW STACKS:\n");
+		for (i = 0; i < kt->cpus; i++) {
+			pad = (i < 10) ? 3 : (i < 100) ? 2 : (i < 1000) ? 1 : 0;
+			fprintf(fp, "%s CPU %d: %lx\n", space(pad), i,
+				machdep->machspec->overflow_stacks[i]);
+		}
+	}
 }
 
 static int
@@ -3875,24 +3985,41 @@ arm64_on_process_stack(struct bt_info *bt, ulong stkptr)
 }
 
 static int
-arm64_on_irq_stack(int cpu, ulong stkptr)
+arm64_in_alternate_stackv(int cpu, ulong stkptr, ulong *stacks, ulong stack_size)
 {
-	return arm64_in_alternate_stack(cpu, stkptr);
+	if ((cpu >= kt->cpus) || (stacks == NULL) || !stack_size)
+		return FALSE;
+
+	if ((stkptr >= stacks[cpu]) &&
+	    (stkptr < (stacks[cpu] + stack_size)))
+		return TRUE;
+
+	return FALSE;
 }
 
 static int
 arm64_in_alternate_stack(int cpu, ulong stkptr)
+{
+	return (arm64_on_irq_stack(cpu, stkptr) ||
+		arm64_on_overflow_stack(cpu, stkptr));
+}
+
+static int
+arm64_on_irq_stack(int cpu, ulong stkptr)
 {
 	struct machine_specific *ms = machdep->machspec;
 
-	if (!ms->irq_stack_size || (cpu >= kt->cpus))
-		return FALSE;
+	return arm64_in_alternate_stackv(cpu, stkptr,
+		ms->irq_stacks, ms->irq_stack_size);
+}
 
-	if ((stkptr >= ms->irq_stacks[cpu]) &&
-	    (stkptr < (ms->irq_stacks[cpu] + ms->irq_stack_size)))
-		return TRUE;
+static int
+arm64_on_overflow_stack(int cpu, ulong stkptr)
+{
+	struct machine_specific *ms = machdep->machspec;
 
-	return FALSE;
+	return arm64_in_alternate_stackv(cpu, stkptr,
+		ms->overflow_stacks, ms->overflow_stack_size);
 }
 
 static void
@@ -3905,6 +4032,16 @@ arm64_set_irq_stack(struct bt_info *bt)
 	alter_stackbuf(bt);
 }
 
+static void
+arm64_set_overflow_stack(struct bt_info *bt)
+{
+	struct machine_specific *ms = machdep->machspec;
+
+	bt->stackbase = ms->overflow_stacks[bt->tc->processor];
+	bt->stacktop = bt->stackbase + ms->overflow_stack_size;
+	alter_stackbuf(bt);
+}
+
 static void
 arm64_set_process_stack(struct bt_info *bt)
 {
diff --git a/defs.h b/defs.h
index a2f3085..7e2a16e 100644
--- a/defs.h
+++ b/defs.h
@@ -3218,6 +3218,7 @@ typedef signed int s32;
 #define UNW_4_14 (0x200)
 #define FLIPPED_VM (0x400)
 #define HAS_PHYSVIRT_OFFSET (0x800)
+#define OVERFLOW_STACKS (0x1000)
 
 /*
  * Get kimage_voffset from /dev/crash
@@ -3260,6 +3261,7 @@ typedef signed int s32;
 
 #define ARM64_STACK_SIZE (16384)
 #define ARM64_IRQ_STACK_SIZE ARM64_STACK_SIZE
+#define ARM64_OVERFLOW_STACK_SIZE (4096)
 
 #define _SECTION_SIZE_BITS 30
 #define _SECTION_SIZE_BITS_5_12 27
@@ -3332,6 +3334,9 @@ struct machine_specific {
 	char *irq_stackbuf;
 	ulong __irqentry_text_start;
 	ulong __irqentry_text_end;
+	ulong overflow_stack_size;
+	ulong *overflow_stacks;
+	char *overflow_stackbuf;
 	/* for exception vector code */
 	ulong exp_entry1_start;
 	ulong exp_entry1_end;
@@ -5770,6 +5775,7 @@ ulong cpu_map_addr(const char *type);
 #define BT_CPUMASK (0x1000000000000ULL)
 #define BT_SHOW_ALL_REGS (0x2000000000000ULL)
 #define BT_REGS_NOT_FOUND (0x4000000000000ULL)
+#define BT_OVERFLOW_STACK (0x8000000000000ULL)
 
 #define BT_SYMBOL_OFFSET (BT_SYMBOLIC_ARGS)
 #define BT_REF_HEXVAL (0x1)
-- 
2.25.1
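
For readers who want to try the address logic outside of crash, here is a minimal standalone sketch of the two ideas the patch relies on: each CPU's overflow stack base is the per-CPU offset plus the symbol value, and a pointer is on that stack if it falls in the half-open range [base, base + size). The names demo_fill_stack_bases, demo_in_stack and DEMO_OVERFLOW_STACK_SIZE are hypothetical and only mirror the patch; the addresses in the usage note are made up.

```c
#include <stddef.h>

/* Mirrors ARM64_OVERFLOW_STACK_SIZE (4096) from the patch; demo name only. */
#define DEMO_OVERFLOW_STACK_SIZE 4096UL

/*
 * Compute the base address of a per-CPU object for every CPU:
 * base[i] = per_cpu_offset[i] + symbol_value.
 */
static void demo_fill_stack_bases(unsigned long *bases,
				  const unsigned long *per_cpu_offset,
				  unsigned long symbol_value, int cpus)
{
	for (int i = 0; i < cpus; i++)
		bases[i] = per_cpu_offset[i] + symbol_value;
}

/*
 * Return 1 if ptr lies inside the given CPU's stack, i.e. in the
 * half-open range [bases[cpu], bases[cpu] + size); otherwise return 0.
 * Invalid input (bad CPU number, NULL table, zero size) also returns 0.
 */
static int demo_in_stack(int cpu, int cpus, unsigned long ptr,
			 const unsigned long *bases, unsigned long size)
{
	if (cpu < 0 || cpu >= cpus || bases == NULL || size == 0)
		return 0;

	return (ptr >= bases[cpu]) && (ptr < bases[cpu] + size);
}
```

With two CPUs at per-CPU offsets 0x10000 and 0x20000 and a symbol value of 0x800, the bases come out as 0x10800 and 0x20800. An address of 0x10800 + 4095 is still on CPU 0's stack, while 0x10800 + 4096 is one byte past its end.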
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility