On Tue, Oct 4, 2022 at 10:22 PM Lianbo Jiang <lijiang@xxxxxxxxxx> wrote:
>
> Currently crash will fail and then exit, if the initialization of
> the emergency stacks information fails. In real customer environments,
> sometimes, a vmcore may be partially damaged, although such vmcores
> are rare. For example:
>
> # ./crash ../3.10.0-1127.18.2.el7.ppc64le/vmcore ../3.10.0-1127.18.2.el7.ppc64le/vmlinux -s
> crash: invalid kernel virtual address: 38 type: "paca->emergency_sp"
> #
>
> Let's try to keep loading the vmcore if such issues happen, so call
> readmem() with RETURN_ON_ERROR instead of FAULT_ON_ERROR,
> which allows crash to move on.
>
> Reported-by: Dave Wysochanski <dwysocha@xxxxxxxxxx>
> Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx>
> ---
>  ppc64.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/ppc64.c b/ppc64.c
> index 4ea1f7c0c6f8..f94b402ec64d 100644
> --- a/ppc64.c
> +++ b/ppc64.c
> @@ -1224,13 +1224,13 @@ ppc64_init_paca_info(void)
>  		ulong paca_loc;
>
>  		readmem(symbol_value("paca_ptrs"), KVADDR, &paca_loc, sizeof(void *),
> -			"paca double pointer", FAULT_ON_ERROR);
> +			"paca double pointer", RETURN_ON_ERROR);
>  		readmem(paca_loc, KVADDR, paca_ptr, sizeof(void *) * kt->cpus,
> -			"paca pointers", FAULT_ON_ERROR);
> +			"paca pointers", RETURN_ON_ERROR);
>  	} else if (symbol_exists("paca") &&
>  		   (get_symbol_type("paca", NULL, NULL) == TYPE_CODE_PTR)) {
>  		readmem(symbol_value("paca"), KVADDR, paca_ptr, sizeof(void *) * kt->cpus,
> -			"paca pointers", FAULT_ON_ERROR);
> +			"paca pointers", RETURN_ON_ERROR);
>  	} else {
>  		free(paca_ptr);
>  		return;
> @@ -1245,7 +1245,7 @@ ppc64_init_paca_info(void)
>  		for (i = 0; i < kt->cpus; i++)
>  			readmem(paca_ptr[i] + offset, KVADDR, &ms->emergency_sp[i],
>  				sizeof(void *), "paca->emergency_sp",
> -				FAULT_ON_ERROR);
> +				RETURN_ON_ERROR);
>  	}
>
>  	if (MEMBER_EXISTS("paca_struct", "nmi_emergency_sp")) {
> @@ -1256,7 +1256,7 @@ ppc64_init_paca_info(void)
>  		for (i = 0; i < kt->cpus; i++)
>  			readmem(paca_ptr[i] + offset, KVADDR, &ms->nmi_emergency_sp[i],
>  				sizeof(void *), "paca->nmi_emergency_sp",
> -				FAULT_ON_ERROR);
> +				RETURN_ON_ERROR);
>  	}
>
>  	if (MEMBER_EXISTS("paca_struct", "mc_emergency_sp")) {
> @@ -1267,7 +1267,7 @@ ppc64_init_paca_info(void)
>  		for (i = 0; i < kt->cpus; i++)
>  			readmem(paca_ptr[i] + offset, KVADDR, &ms->mc_emergency_sp[i],
>  				sizeof(void *), "paca->mc_emergency_sp",
> -				FAULT_ON_ERROR);
> +				RETURN_ON_ERROR);
>  	}
>
>  	free(paca_ptr);
> --
> 2.37.1
>

Consider adding a 'Fixes' tag for the patch that introduced this
problem, per the bisect in
https://bugzilla.redhat.com/show_bug.cgi?id=2127525

Fixes: cdd57e8b16ab ("ppc64: handle backtrace when CPU is in an emergency stack")

Other than that, I tested this patch, and crash now loads the vmcore in
question. It prints many "invalid kernel virtual address" messages like
the one below, but I think that is fine:

...
crash: invalid kernel virtual address: 2d0 type: "paca->mc_emergency_sp"
crash> quit

Tested-and-Reviewed-by: Dave Wysochanski <dwysocha@xxxxxxxxxx>

Good job Lianbo!
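
For anyone following the thread, here is a minimal standalone sketch of why the loop keeps going once a read failure is returned instead of treated as fatal. It is not crash code: every `mock_` name, the flag values, the fake memory layout, and the 0x38 offset (borrowed from the error message in the report) are assumptions made up for illustration. It also checks the return value and zeroes the failed slot, which the patch itself does not do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the two error-handling modes; values are illustrative. */
#define MOCK_FAULT_ON_ERROR   0x1
#define MOCK_RETURN_ON_ERROR  0x2

#define MOCK_BASE       0x1000ULL  /* first valid address of the fake dump */
#define MOCK_SP_OFFSET  0x38ULL    /* assumed offset of emergency_sp in a paca */
#define MOCK_NCPUS      3

/* Fake dump memory: two pacas of 0x40 bytes each, emergency_sp in the last word. */
static const uint64_t mock_mem[16] = {
	[7]  = 0xc000000000ff8000ULL,  /* CPU 0 emergency_sp */
	[15] = 0xc000000000ffc000ULL,  /* CPU 2 emergency_sp */
};

/* Per-CPU paca pointers; the 0 entry models a damaged vmcore. */
const uint64_t mock_paca_ptr[MOCK_NCPUS] = { 0x1000ULL, 0x0ULL, 0x1040ULL };

/* Returns 1 on success, 0 on failure; a failure is fatal under MOCK_FAULT_ON_ERROR. */
int mock_readmem(uint64_t addr, void *buf, size_t size,
		 const char *type, int error_handle)
{
	assert(buf != NULL);

	if (addr < MOCK_BASE || addr - MOCK_BASE + size > sizeof(mock_mem)) {
		fprintf(stderr, "mock: invalid address: %llx type: \"%s\"\n",
			(unsigned long long)addr, type);
		if (error_handle & MOCK_FAULT_ON_ERROR)
			exit(EXIT_FAILURE);  /* old behavior: give up entirely */
		return 0;                    /* new behavior: let the caller decide */
	}
	memcpy(buf, (const char *)mock_mem + (addr - MOCK_BASE), size);
	return 1;
}

/* Reads emergency_sp for every CPU; returns how many reads failed. */
int mock_init_emergency_sp(const uint64_t *paca_ptr, int ncpus,
			   uint64_t *emergency_sp)
{
	int i, failed = 0;

	for (i = 0; i < ncpus; i++) {
		if (!mock_readmem(paca_ptr[i] + MOCK_SP_OFFSET, &emergency_sp[i],
				  sizeof(uint64_t), "paca->emergency_sp",
				  MOCK_RETURN_ON_ERROR)) {
			emergency_sp[i] = 0;  /* mark the slot as unknown */
			failed++;
		}
	}
	return failed;
}
```

Calling `mock_init_emergency_sp(mock_paca_ptr, MOCK_NCPUS, sp)` logs one invalid-address message for the damaged CPU and still fills in the other two slots. Whether the real code should also check the return value and clear the slot is a separate question from this patch.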