On 2022/10/06 9:42, lijiang wrote: > Hi, Kazu > Thank you for the comment. > On Wed, Oct 5, 2022 at 5:11 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> > wrote: > >> Hi Lianbo, >> >> On 2022/10/05 11:22, Lianbo Jiang wrote: >>> Currently crash will fail and then exit, if the initialization of >>> the emergency stacks information fails. In real customer environments, >>> sometimes, a vmcore may be partially damaged, although such vmcores >>> are rare. For example: >>> >>> # ./crash ../3.10.0-1127.18.2.el7.ppc64le/vmcore >> ../3.10.0-1127.18.2.el7.ppc64le/vmlinux -s >>> crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" >>> # >>> >>> Lets try to keep loading vmcore if such issues happen, so call >>> the readmem() with the RETURN_ON_ERROR instead of FAULT_ON_ERROR, >>> which allows the crash move on. >> >> Just to confirm, can I have the error messages printed with the patch >> for the vmcore? >> > > Yes. Crash has the following error messages printed after applying this > patch: > > # ./crash ../vmcore ../vmlinux -s > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: fc4 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 38 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 1000138 type: "paca->emergency_sp" > crash: invalid kernel virtual address: 100000040 type: "paca->emergency_sp" > ... Thanks, I see. These errors can be printed for the number of CPUs, but better than nothing. Looking at the users of ms->*emergency_sp, when the errors occur, 1. ppc64_in_emergency_stack() proceed with the initialized value 0, and the following condition will happen to fail: if (addr >= base && addr < top) { but I would like to add this to check it in advance: --- a/ppc64.c +++ b/ppc64.c @@ -1947,7 +1947,7 @@ ppc64_in_emergency_stack(int cpu, ulong addr, bool verbose) if (cpu < 0 || cpu >= kt->cpus) return NONE_STACK; - if (ms->emergency_sp) { + if (ms->emergency_sp && ms->emergency_sp[cpu]) { top = ms->emergency_sp[cpu]; base = top - STACKSIZE(); if (addr >= base && addr < top) { @@ -1957,7 +1957,7 @@ ppc64_in_emergency_stack(int cpu, ulong addr, bool verbose) } } - if (ms->nmi_emergency_sp) { + if (ms->nmi_emergency_sp && ms->nmi_emergency_sp[cpu]) { top = ms->nmi_emergency_sp[cpu]; base = top - STACKSIZE(); if (addr >= base && addr < top) { @@ -1967,7 +1967,7 @@ ppc64_in_emergency_stack(int cpu, ulong addr, bool verbose) } } - if (ms->mc_emergency_sp) { + if (ms->mc_emergency_sp && ms->mc_emergency_sp[cpu]) { top = ms->mc_emergency_sp[cpu]; base = top - STACKSIZE(); if (addr >= base && addr < top) { 2. ppc64_set_bt_emergency_stack() is called after ppc64_in_emergency_stack() returning other than NONE_STACK, so it should have a valid value. No need to check it again. So, with the additional change above, Acked-by: Kazuhito Hagio <k-hagio-ab@xxxxxxx> Lianbo, if the above is ok for you, please apply with it. Thanks, Kazu -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/crash-utility Contribution Guidelines: https://github.com/crash-utility/crash/wiki