Re: [PATCH] crash: Do not use bt -t flag in panic_search()

On Thu, 6 Aug 2015 11:25:29 -0400 (EDT)
Dave Anderson <anderson@xxxxxxxxxx> wrote:

> Re: your dumpfile where the erroneous "panic" address in a random user
> task's exception frame register set gets picked up by mistake.  
> 
> Your original patch request modified the "bt" command used for the
> kernel stack searches in panic_search().  But that piece of code
> is the last-ditch effort for finding a panic task, which follows 
> this path:
> 
>   get_panic_context()
>     panic_search()
>       get_dumpfile_panic_task()
>         get_kdump_panic_task()       (requires kdump "crashing_cpu" symbol)
>         get_diskdump_panic_task()    (requires kdump "crashing_cpu" symbol)

On s390 we don't have the "crashing_cpu" symbol in the kernel.

>         get_active_set_panic_task()  (bt -r raw stack dump of active cpus)
>     ...
>       
> Only if all of the above fail does panic_search() initiate the
> exhaustive walkthrough of all kernel stacks for evidence.
> 
> Since you have gotten that far, I'm wondering whether your
> target dumpfile with the faulty "panic" address is from an
> s390x "live dump"?  In that case, there can never be any task 
> with any such evidence, making the backtrace search a waste of 
> time to begin with.
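
The lookup order quoted above can be read as a chain of finders tried in turn, with the exhaustive stack walk as the last entry. The following is only a minimal sketch of that ordering, not crash's actual code: `task_addr_t`, `SKETCH_NO_TASK`, `first_panic_task()` and the two toy finders are hypothetical names, and the task address is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not taken from crash's sources. */
typedef unsigned long task_addr_t;
#define SKETCH_NO_TASK 0UL

typedef task_addr_t (*finder_fn)(void);

/*
 * Try each finder in order and return the first result that is not
 * SKETCH_NO_TASK.  Cheap, reliable lookups belong at the front of the
 * array, the exhaustive stack search at the end.
 */
task_addr_t first_panic_task(const finder_fn *finders, size_t n)
{
	assert(finders != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		task_addr_t task = finders[i]();

		if (task != SKETCH_NO_TASK)
			return task;
	}
	return SKETCH_NO_TASK;
}

/* Toy finders: one that finds nothing, one that reports a made-up task. */
task_addr_t finder_fails(void) { return SKETCH_NO_TASK; }
task_addr_t finder_hits(void)  { return 0x1000UL; }
```

With a chain such as `{ finder_fails, finder_fails, finder_hits }` only the last entry produces a task, which mirrors the case where the dumpfile-specific lookups fail and the search falls through to the final step.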

The "problem" dump is an s390 stand-alone dump of a hanging system.
All CPUs were in "psw_idle" when the dump was generated:

PID: 0      TASK: c50f38            CPU: 0   COMMAND: "swapper/0"
 LOWCORE INFO:
  -psw      : 0x0706c00180000000 0x000000000084410e
  -function : psw_idle at 84410e

[snip]

 #0 [00c1fe70] arch_cpu_idle at 104d4a
 #1 [00c1fe90] cpu_startup_entry at 180430
 #2 [00c1fee8] start_kernel at d1fb10
 #3 [00c1ff60] _stext at 100020


> 
> And if so, I'm thinking that since s390x will have set the LIVE_DUMP
> flag, if get_dumpfile_panic_task() returns NO_TASK, then 
> panic_search() should just return a NULL to get_panic_context()
> if it's a live dump, which will just default to the idle task on
> cpu 0.

Although it does not solve the above problem, it makes sense for
live dumps. What about the following patch?
---
crash: do not search panic tasks for live dumps

Always return "NO_TASK" if the "LIVE_DUMP" flag is set because live dumps
cannot have a panic task.

Signed-off-by: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>
---
 task.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/task.c
+++ b/task.c
@@ -6726,7 +6726,10 @@ get_dumpfile_panic_task(void)
 {
 	ulong task;
 
-	if (NETDUMP_DUMPFILE()) {
+	if (pc->flags2 & LIVE_DUMP) {
+		/* No panic task because system itself created the dump */
+		return NO_TASK;
+	} else if (NETDUMP_DUMPFILE()) {
 		task = pc->flags & REM_NETDUMP ?
 			tt->panic_task : get_netdump_panic_task();
 		if (task) 
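
For readers without the crash sources at hand, the effect of the hunk can be modelled in isolation. This is a rough sketch under stated assumptions: `struct sketch_dump`, `SKETCH_LIVE_DUMP`, `SKETCH_NO_TASK` and `sketch_dumpfile_panic_task()` are hypothetical, the flag value is arbitrary, and the format-specific lookups are reduced to a single stored field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values; crash defines its own LIVE_DUMP and NO_TASK. */
#define SKETCH_LIVE_DUMP (1UL << 0)
#define SKETCH_NO_TASK   0UL

struct sketch_dump {
	unsigned long flags2;        /* modelled on pc->flags2 */
	unsigned long recorded_task; /* stands in for the per-format lookups */
};

/*
 * Return the panic task for a dump, or SKETCH_NO_TASK for a live dump.
 * The live-dump test comes first so that no format-specific lookup is
 * attempted when the system itself produced the dump.
 */
unsigned long sketch_dumpfile_panic_task(const struct sketch_dump *d)
{
	assert(d != NULL);

	if (d->flags2 & SKETCH_LIVE_DUMP)
		return SKETCH_NO_TASK;

	return d->recorded_task;
}
```

Checking the flag before anything else means a stale or misleading value in the dump cannot be picked up as a panic task when the dump is live.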
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
