On Mon, 10 Aug 2015 11:42:26 -0400 (EDT)
Dave Anderson <anderson@xxxxxxxxxx> wrote:

> ----- Original Message -----
> > On Mon, 10 Aug 2015 10:32:12 -0400 (EDT)
> > Dave Anderson <anderson@xxxxxxxxxx> wrote:
> >
> > > ----- Original Message -----
> > > >
> > > > On Thu, 6 Aug 2015 11:25:29 -0400 (EDT)
> > > > Dave Anderson <anderson@xxxxxxxxxx> wrote:
> > > >
> > > > > Re: your dumpfile where the erroneous "panic" address in a random user
> > > > > task's exception frame register set gets picked up by mistake.
> > > > >
> > > > > Your original patch request modified the "bt" command used for the
> > > > > kernel stack searches in panic_search(). But that piece of code
> > > > > is the last-ditch effort for finding a panic task, which follows
> > > > > this path:
> > > > >
> > > > >   get_panic_context()
> > > > >     panic_search()
> > > > >       get_dumpfile_panic_task()
> > > > >         get_kdump_panic_task() (requires kdump "crashing_cpu" symbol)
> > > > >         get_diskdump_panic_task() (requires kdump "crashing_cpu" symbol)
> > > >
> > > > On s390 we don't have the "crashing_cpu" symbol in the kernel.
> > > >
> > > > >         get_active_set_panic_task() (bt -r raw stack dump of active cpus)
> > > > >       ...
> > > > >
> > > > > Only if all of the above fail, does panic_search() initiate the
> > > > > exhaustive walkthrough of all kernel stacks for evidence.
> > > > >
> > > > > Since you have gotten that far, I'm wondering whether your
> > > > > target dumpfile with the faulty "panic" address is from an
> > > > > s390x "live dump"? In that case, there can never be any task
> > > > > with any such evidence, making the backtrace search a waste of
> > > > > time to begin with.
> > > >
> > > > The "problem" dump is a s390 stand-alone dump of a hanging system.
> > > > All CPUs have been in "psw_idle" when the dump was generated:
> > > >
> > > > PID: 0      TASK: c50f38      CPU: 0   COMMAND: "swapper/0"
> > > >  LOWCORE INFO:
> > > >   -psw      : 0x0706c00180000000 0x000000000084410e
> > > >   -function : psw_idle at 84410e
> > > >
> > > > [snip]
> > > >
> > > >  #0 [00c1fe70] arch_cpu_idle at 104d4a
> > > >  #1 [00c1fe90] cpu_startup_entry at 180430
> > > >  #2 [00c1fee8] start_kernel at d1fb10
> > > >  #3 [00c1ff60] _stext at 100020
> > > >
> > > > > And if so, I'm thinking that since s390x will have set LIVE_DUMP
> > > > > flag set, if get_dumpfile_panic_task() returns NO_TASK, then
> > > > > panic_search() should just return a NULL to get_panic_context()
> > > > > if it's a live dump, which will just default to the idle task on
> > > > > cpu 0.
> > > >
> > > > Although it does not solve the above problem it makes sense for
> > > > live dumps. What about the following patch?
> > > > ---
> > > > crash: do not search panic tasks for live dumps
> > > >
> > > > Always return "NO_TASK" if the "LIVE_DUMP" flag is set because live dumps
> > > > cannot have a panic task.
> > > >
> > > > Signed-off-by: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>
> > > > ---
> > > >  task.c |    5 ++++-
> > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > >
> > > > --- a/task.c
> > > > +++ b/task.c
> > > > @@ -6726,7 +6726,10 @@ get_dumpfile_panic_task(void)
> > > >  {
> > > >  	ulong task;
> > > >
> > > > -	if (NETDUMP_DUMPFILE()) {
> > > > +	if (pc->flags2 & LIVE_DUMP) {
> > > > +		/* No panic task because system itself created the dump */
> > > > +		return NO_TASK;
> > > > +	} else if (NETDUMP_DUMPFILE()) {
> > > >  		task = pc->flags & REM_NETDUMP ?
> > > >  			tt->panic_task : get_netdump_panic_task();
> > > >  		if (task)
> > > >
> > >
> > > That makes sense, but I'm going to move the LIVE_DUMP check farther down
> > > in get_dumpfile_panic_task() to just before the get_active_set() call.
> > >
> >
> > Makes sense. That was also my first idea.
> >
> > > The reason for that another type of "LIVE_DUMP" is from the snap.so extension
> > > module, and in that case, get_kdump_panic_task() finds and returns the "crash"
> > > task that was running the snap command on the live system.
> > >
> > > Clarify something else for me: are there actually two types of live dumps
> > > that can be taken by an s390x? There is the "zgetdump" facility, but is
> > > there also another type that is taken by the firmware and/or the
> > > hypervisor?
> >
> > With the zgetdump tool we create live dumps from /dev/mem or /dev/crash.
> > These dumps get the LIVE_DUMP flag indicating that data is not consistent.
> >
> > Besides of this, we have two other non-disruptive live dump features:
> >
> > - VMDUMP for z/VM guests
> > - Virsh dump for KVM guests
> >
> > In contrast to the zgetdump method here the guest system is stopped
> > to get consistent snapshots. Therefore I think it is fine to *not* set
> > the LIVE_DUMP flag.
> >
> > Besides of those live dump mechanisms (and kdump) we have our stand-alone dump
> > tools for DASD and SCSI. Also these dump methods are "Linux independent" and
> > therefore can produce dumps without panic tasks.
> >
> > You can read more on s390 dump in the documents below:
> >
> > * http://www.vm.ibm.com/education/lvc/LVC1219.pdf
> > * http://www-01.ibm.com/support/knowledgecenter/linuxonibm/liaaf/lnz_r_dt.html?cp=linuxonibm%2F0-4-0-1
> >
> > Michael
>
> OK, so from what I understand, there still can be s390x dumpfiles which have no indication
> of the panic task or cpu (if there is one) in their headers, and therefore may try the "bt -r"
> type search of the active tasks via raw_stack_dump() in get_active_set_panic_task(),
> and if that fails, fall back to the "bt -t" search of all tasks in panic_search().
>
> In those cases, I suppose you could:
>
>  (1) restrict the raw_stack_dump() parameters in get_active_set_panic_task() to exclude
>      the user register dump at the top of the stack, and
>  (2) plug in a MACHDEP_BT_TEXT handler for the s390x instead of using the generic version,
>      and in that case, could prevent the search from entering the user-space register dump
>      at the top of the stack, or
>  (2a) replace "bt -t" with just "bt" in panic_search() for s390x as you did in the original
>      patch.
>
> But (1) and (2) are not fool-proof, because even the kernel-only part of the stack could
> simply contain "numbers" that by dumb luck fall into the zero-based virtual address
> range of panic, crash_kexec, etc., and return a false positive. So I don't know
> how that can be made absolutely reliable.

I would still prefer (2a). See the patch below.

> But at least with dumpfiles that have the live dump magic number (and I'm still
> not clear which of the 4 types do so),

Only the zgetdump live dump gets the live dump magic number.

> the simple LIVE_PATCH-check patch covers
> them. I'm not sure whether it's worth doing anything beyond that.

---
crash: Do not use bt -t flag in panic_search()

On s390 we got a dump where a process "gmain" was incorrectly marked as
the running panic task:

 crash> ps | grep gmain
 >   217      1   5  8bec23420  IN   0.0  463276  18240  gmain

The reason was that the "brute force" parsing of the "bt -t -o" output in
panic_search() found the symbol "panic" on the stack:

 crash> bt -t -o 8bec23420
 PID: 217    TASK: 8bec23420    CPU: 5   COMMAND: "gmain"
  START: __schedule at 83f650
   [   8b662b900] (null) at 0
   [   8b662b978] __schedule at 83f650
  ...
   [   8b662bb18] (null) at 0
   [   8b662bb40] panic at 83679a  <<<<<--------------

The real stack trace was as follows:

 crash> bt 8bec23420
 Detaching after fork from child process 15508.
 PID: 217    TASK: 8bec23420    CPU: 5   COMMAND: "gmain"
  #0 [8b662b8f0] __schedule at 83f650
  #1 [8b662b958] schedule at 83fade
  #2 [8b662b970] schedule_hrtimeout_range_clock at 842fc8
  #3 [8b662ba10] poll_schedule_timeout at 2c6e8a
  #4 [8b662ba30] do_sys_poll at 2c8604
  #5 [8b662be40] sys_poll at 2c8852
  #6 [8b662bea8] system_call at 843a66

The value 0x83679a (panic at 83679a) was a local variable on the stack and
was incorrectly interpreted as a function call to "panic".

On s390 in particular there are dump methods, e.g. VMDUMP or stand-alone dump,
where the "bt -t -o" method will be used to find the panic task.

For that reason, and because the "-t" method is quite risky, use the "normal"
stack backtrace without the "-t" bt option for s390.

Signed-off-by: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>
---
 task.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/task.c
+++ b/task.c
@@ -6633,7 +6633,11 @@ panic_search(void)
 	fd = &foreach_data;
 	fd->keys = 1;
 	fd->keyword_array[0] = FOREACH_BT;
+#ifdef S390X
+	fd->flags |= FOREACH_o_FLAG;
+#else
 	fd->flags |= (FOREACH_t_FLAG|FOREACH_o_FLAG);
+#endif

 	dietask = lasttask = NO_TASK;
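
To illustrate why the "-t" style search can report "panic" for a task that never called it, here is a minimal stand-alone C sketch. It is not crash code: the symbol ranges in symtab, the demo_stack contents, and the helpers text_scan_finds() and frame_walk_finds() are all hypothetical, and the frame layout (word 0 = index of the next frame, word 1 = return address) is a deliberately simplified model, not the real s390 stack frame format. Only the addresses 83f650, 83fade, 2c8852 and 83679a are taken from the output above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical text symbols; the [start, end) ranges are made up. */
struct sym {
	const char *name;
	unsigned long start, end;
};

static const struct sym symtab[] = {
	{ "panic",      0x836700UL, 0x836900UL },
	{ "__schedule", 0x83f000UL, 0x83fa00UL },
	{ "schedule",   0x83fa00UL, 0x83fb00UL },
	{ "sys_poll",   0x2c8800UL, 0x2c8900UL },
};

/* Return the name of the symbol containing addr, or NULL if none does. */
static const char *lookup(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (addr >= symtab[i].start && addr < symtab[i].end)
			return symtab[i].name;
	return NULL;
}

/*
 * Simplified stack model (assumption): each frame starts with a "back
 * chain" word holding the index of the next frame (0 ends the chain),
 * followed by a return address; everything else is local data.
 */
static const unsigned long demo_stack[] = {
	4,          /* [0] back chain -> frame at index 4 */
	0x83f650UL, /* [1] return address in __schedule    */
	0x83679aUL, /* [2] local variable that happens to look like "panic" */
	0,          /* [3] local variable                  */
	7,          /* [4] back chain -> frame at index 7 */
	0x83fadeUL, /* [5] return address in schedule      */
	0x1234UL,   /* [6] local variable                  */
	0,          /* [7] back chain: end of chain        */
	0x2c8852UL, /* [8] return address in sys_poll      */
};
#define DEMO_STACK_WORDS (sizeof(demo_stack) / sizeof(demo_stack[0]))

/* "bt -t" style: every stack word that resolves to text is a candidate. */
static int text_scan_finds(const unsigned long *stack, size_t n,
			   const char *name)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const char *s = lookup(stack[i]);

		if (s && strcmp(s, name) == 0)
			return 1;
	}
	return 0;
}

/* "bt" style: only return addresses reached via the back chain count. */
static int frame_walk_finds(const unsigned long *stack, size_t n,
			    const char *name)
{
	size_t i = 0;

	while (i + 1 < n) {
		const char *s = lookup(stack[i + 1]);
		unsigned long next = stack[i];

		if (s && strcmp(s, name) == 0)
			return 1;
		/* Stop at end of chain or on a corrupt/backward link. */
		if (next == 0 || next <= i || next >= n)
			break;
		i = (size_t)next;
	}
	return 0;
}
```

With this demo stack, text_scan_finds(demo_stack, DEMO_STACK_WORDS, "panic") returns 1 because of the local variable at index 2, while frame_walk_finds(demo_stack, DEMO_STACK_WORDS, "panic") returns 0, since only the three return addresses are inspected. As noted above, a frame walk is still only as good as the unwinder, so this shows the reduced risk, not a guarantee.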
-- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility