----- Original Message -----
> > >
> > > With the zgetdump tool we create live dumps from /dev/mem or /dev/crash.
> > > These dumps get the LIVE_DUMP flag indicating that data is not
> > > consistent.
> > >
> > > Besides of this, we have two other non-disruptive live dump features:
> > >
> > > - VMDUMP for z/VM guests
> > > - Virsh dump for KVM guests
> > >
> > > In contrast to the zgetdump method here the guest system is stopped
> > > to get consistent snapshots. Therefore I think it is fine to *not* set
> > > the LIVE_DUMP flag.
> > >
> > > Besides of those live dump mechanisms (and kdump) we have our stand-alone dump
> > > tools for DASD and SCSI. Also these dump methods are "Linux independent" and
> > > therefore can produce dumps without panic tasks.
> > >
> > > You can read more on s390 dump in the documents below:
> > >
> > > * http://www.vm.ibm.com/education/lvc/LVC1219.pdf
> > > * http://www-01.ibm.com/support/knowledgecenter/linuxonibm/liaaf/lnz_r_dt.html?cp=linuxonibm%2F0-4-0-1
> > >
> > > Michael
> >
> > OK, so from what I understand, there still can be s390x dumpfiles which have no indication
> > of the panic task or cpu (if there is one) in their headers, and therefore may try the "bt -r"
> > type search of the active tasks via raw_stack_dump() in get_active_set_panic_task(),
> > and if that fails, fall back to the "bt -t" search of all tasks in panic_search().
> >
> > In those cases, I suppose you could:
> >
> > (1) restrict the raw_stack_dump() parameters in
> >     get_active_set_panic_task() to exclude
> >     the user register dump at the top of the stack, and
> > (2) plug in a MACHDEP_BT_TEXT handler for the s390x instead of using the generic version,
> >     and in that case, could prevent the search from entering the user-space register dump
> >     at the top of the stack, or
> > (2a) replace "bt -t" with just "bt" in panic_search() for s390x as you did in the original
> >     patch.
> >
> > But (1) and (2) are not fool-proof, because even the kernel-only part of the stack could
> > simply contain "numbers" that by dumb luck fall into the zero-based virtual address
> > range of panic, crash_kexec, etc., and return a false positive. So I don't know
> > how that can be made absolutely reliable.
>
> I still would prefer 2a. See patch below.

OK, that's fine with me.

> >
> > But at least with dumpfiles that have the live dump magic number (and I'm still
> > not clear which of the 4 types do so),
>
> Only the zgetdump live dump gets the live dump magic number.

OK, thanks for the clarification -- I'll update the changelog to indicate that.

Queued for crash-7.1.3:

  https://github.com/crash-utility/crash/commit/3c2fc5f2a027fe192327101cdc6db0e24a4794d9

Thanks,
  Dave

> > the simple LIVE_PATCH-check patch covers
> > them. I'm not sure whether it's worth doing anything beyond that.
> ---
> crash: Do not use bt -t flag in panic_search()
>
> On s390 we got a dump where a process "gmain" was incorrectly marked as
> running panic task:
>
> crash> ps | grep gmain
> >   217      1   5  8bec23420  IN   0.0  463276  18240  gmain
>
> The reason was that the "brute force" way parsing the "bt -t -o"
> output in panic_search() found the symbol "panic" on the stack:
>
> crash> bt -t -o 8bec23420
> PID: 217  TASK: 8bec23420  CPU: 5  COMMAND: "gmain"
> START: __schedule at 83f650
> [ 8b662b900] (null) at 0
> [ 8b662b978] __schedule at 83f650
> ...
> [ 8b662bb18] (null) at 0
> [ 8b662bb40] panic at 83679a <<<<<--------------
>
> The real stack trace was as follows:
>
> crash> bt 8bec23420
> Detaching after fork from child process 15508.
> PID: 217  TASK: 8bec23420  CPU: 5  COMMAND: "gmain"
> #0 [8b662b8f0] __schedule at 83f650
> #1 [8b662b958] schedule at 83fade
> #2 [8b662b970] schedule_hrtimeout_range_clock at 842fc8
> #3 [8b662ba10] poll_schedule_timeout at 2c6e8a
> #4 [8b662ba30] do_sys_poll at 2c8604
> #5 [8b662be40] sys_poll at 2c8852
> #6 [8b662bea8] system_call at 843a66
>
> The value 0x83679a (panic at 83679a) was a local variable on the stack
> and was interpreted incorrectly as function call to "panic".
>
> Especially for s390 there are dump methods, e.g. VMDUMP or stand-alone dump,
> where the "bt -t -o" method will be used to find the panic task. Therefore
> and because the "-t" method is quite risky, we use the "normal" stack
> backtrace without the "-t" bt option for s390.
>
> Signed-off-by: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>
> ---
> task.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> --- a/task.c
> +++ b/task.c
> @@ -6633,7 +6633,11 @@ panic_search(void)
>  	fd = &foreach_data;
>  	fd->keys = 1;
>  	fd->keyword_array[0] = FOREACH_BT;
> +#ifdef S390X
> +	fd->flags |= FOREACH_o_FLAG;
> +#else
>  	fd->flags |= (FOREACH_t_FLAG|FOREACH_o_FLAG);
> +#endif
>
>  	dietask = lasttask = NO_TASK;
>
>

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
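
The false positive described in the changelog can be illustrated with a
small stand-alone C sketch. It is not taken from crash: the symbol ranges,
the stack_slot structure with its is_return_addr flag, and the functions
lookup_text() and stack_mentions() are hypothetical and exist only for this
example. Only the addresses 83f650, 2c8852, 843a66 and 83679a come from the
"gmain" output quoted in the thread. The sketch assumes that a "bt -t" style
search resolves every stack word that falls into kernel text, while a plain
"bt" only looks at the return addresses found by walking the frames.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical text symbol table; the ranges are invented for the example. */
struct text_sym {
	uint64_t start, end;	/* [start, end) */
	const char *name;
};

static const struct text_sym text_syms[] = {
	{ 0x2c8800, 0x2c8900, "sys_poll" },
	{ 0x836700, 0x836900, "panic" },
	{ 0x83f600, 0x83f700, "__schedule" },
	{ 0x843a00, 0x843b00, "system_call" },
};

/*
 * Simplified model of a kernel stack: each word carries a flag saying
 * whether a frame walk would treat it as a return address.
 */
struct stack_slot {
	uint64_t value;
	int is_return_addr;
};

/* Made-up stack loosely modeled on the "gmain" task from the thread. */
static const struct stack_slot gmain_stack[] = {
	{ 0x83f650, 1 },	/* __schedule */
	{ 0x0,      0 },
	{ 0x2c8852, 1 },	/* sys_poll */
	{ 0x843a66, 1 },	/* system_call */
	{ 0x0,      0 },
	{ 0x83679a, 0 },	/* stale local value that lands inside panic */
};

/* Return the name of the text symbol containing addr, or NULL. */
const char *lookup_text(uint64_t addr)
{
	for (size_t i = 0; i < ARRAY_LEN(text_syms); i++)
		if (addr >= text_syms[i].start && addr < text_syms[i].end)
			return text_syms[i].name;
	return NULL;
}

/*
 * Return 1 if the stack refers to the symbol "name", else 0.
 * frames_only == 0: consider every stack word (brute-force text search).
 * frames_only == 1: consider only words flagged as return addresses.
 */
int stack_mentions(const struct stack_slot *stack, size_t n,
		   const char *name, int frames_only)
{
	for (size_t i = 0; i < n; i++) {
		if (frames_only && !stack[i].is_return_addr)
			continue;
		const char *sym = lookup_text(stack[i].value);
		if (sym && strcmp(sym, name) == 0)
			return 1;
	}
	return 0;
}
```

With this model, stack_mentions(gmain_stack, ARRAY_LEN(gmain_stack), "panic", 0)
reports a hit because of the stale value 0x83679a, whereas the same call with
frames_only set to 1 does not. That is the difference the patch relies on
when it drops FOREACH_t_FLAG for S390X.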