----- Original Message -----
> Thank you for the guidance Dave.
>
> I have two questions regarding runq.
>
> 1. Could you please let me know how the active task has spent more time than
> uptime on some CPUs?

I'm not sure, other than that the uptime is calculated from jiffies, whereas
the runq -m option uses each run queue's per-cpu timestamp.

>
> crash> runq -m
> CPU 0: [0 00:23:29.808] PID: 529 TASK: ffff88079d0d1e40 COMMAND: "kworker/u141:1"
> CPU 1: [1 12:10:42.840] PID: 0 TASK: ffff88079df48000 COMMAND: "swapper/1"
> CPU 2: [1 12:10:42.841] PID: 0 TASK: ffff88079df4bc80 COMMAND: "swapper/2"
> CPU 3: [1 12:10:42.841] PID: 0 TASK: ffff88079df4dac0 COMMAND: "swapper/3"
> CPU 4: [1 12:10:42.841] PID: 0 TASK: ffff88079df49e40 COMMAND: "swapper/4"
> CPU 5: [1 12:10:42.841] PID: 0 TASK: ffff88079df58000 COMMAND: "swapper/5"
>
> crash> sys
> KERNEL: ./usr/lib/debug/usr/lib/modules/4.14.19-coreos/vmlinux
> DUMPFILE: gt-user2-gmt-612746ca.vmss
> CPUS: 70
> DATE: Wed Feb 21 14:53:20 2018
> UPTIME: 1 days, 11:52:25
> LOAD AVERAGE: 70.70, 30.98, 12.88
> TASKS: 2312
> NODENAME: gt-user2-gmt.com
> RELEASE: 4.14.19-coreos
> VERSION: #1 SMP Wed Feb 14 03:18:05 UTC 2018
> MACHINE: x86_64 (2094 Mhz)
> MEMORY: 60 GB
> PANIC: ""
> crash>
>
> 2. Is there a way to find out why some CPUs have a time lag in the run queue?

I don't know, but I would certainly look at the task backtraces of the cpus
that have the large lag values.

>
> CPU 32: 0.00 secs
> CPU 65: 0.00 secs
> CPU 54: 0.00 secs
> CPU 0: 0.01 secs
> CPU 16: 84.22 secs
> CPU 66: 268.75 secs
> CPU 58: 268.75 secs
> CPU 57: 268.75 secs
> CPU 43: 268.75 secs
> CPU 20: 268.75 secs
> CPU 7: 268.75 secs
>
> crash>
> I'm struggling to find out why my VM hung (unresponsive to ping/ssh and a couple
> of CPUs at 100% utilization).
>
> -Eshak
>
> On Thu, Feb 22, 2018 at 6:27 AM, Dave Anderson < anderson@xxxxxxxxxx > wrote:
>
> ----- Original Message -----
> > Hello Dave,
> >
> > I got a kernel freeze yesterday and am able to successfully open the memory
> > image using crash utility.
> >
> > crash> sys
> > KERNEL: ./usr/lib/debug/usr/lib/modules/4.14.19-coreos/vmlinux
> > DUMPFILE: gt-Server02-gmt-612746ca.vmss
> > CPUS: 70
> > DATE: Wed Feb 21 14:53:20 2018
> > UPTIME: 1 days, 11:52:25
> > LOAD AVERAGE: 70.70, 30.98, 12.88
> > TASKS: 2312
> > NODENAME: gt-Server02-gmt.com
> > RELEASE: 4.14.19-coreos
> > VERSION: #1 SMP Wed Feb 14 03:18:05 UTC 2018
> > MACHINE: x86_64 (2094 Mhz)
> > MEMORY: 60 GB
> > PANIC: ""
> > crash>
> >
> > Could you please guide me about couple of things I should check in case of
> > a kernel freeze before diving in deep to find the root cause ?
>
> I'm not sure what you mean by a "kernel freeze", but typically something
> would complain about a hard or soft lockup in the system log. So I would
> first run "log" to see if there's anything of interest. Run "bt -a" on
> the active tasks to see if the active tasks are contesting for something,
> or work your way through "foreach bt" to see what the tasks of interest are
> doing/waiting on. It would seem that some task has taken control of something,
> a lock, or counter, or whatever, and many other tasks have blocked waiting
> for its release. So there's probably a common theme among the blocked tasks
> that might give you a clue.
>
> Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
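
For reference, the size of the discrepancy raised in the first question can be
worked out from the two outputs quoted in this thread. The sketch below is only
an illustration: the helper names parse_runq_stamp and parse_uptime are
hypothetical (they are not part of crash), and the parsing assumes the
"D HH:MM:SS.mmm" and "N days, HH:MM:SS" layouts exactly as they appear above.

```python
import re
from datetime import timedelta


def parse_runq_stamp(stamp: str) -> timedelta:
    """Parse the bracketed 'D HH:MM:SS.mmm' value from a 'runq -m' line.

    Hypothetical helper; assumes a three-digit millisecond field as quoted above.
    """
    m = re.fullmatch(r"(\d+) (\d+):(\d+):(\d+)\.(\d{3})", stamp.strip())
    if m is None:
        raise ValueError(f"unexpected runq -m stamp: {stamp!r}")
    days, hours, minutes, seconds, millis = map(int, m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


def parse_uptime(text: str) -> timedelta:
    """Parse the UPTIME field from 'sys', e.g. '1 days, 11:52:25'.

    Hypothetical helper; the 'N days, ' prefix is treated as optional.
    """
    m = re.fullmatch(r"(?:(\d+) days?, )?(\d+):(\d+):(\d+)", text.strip())
    if m is None:
        raise ValueError(f"unexpected UPTIME value: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


# Values copied from the outputs quoted in this thread.
uptime = parse_uptime("1 days, 11:52:25")
cpu0 = parse_runq_stamp("0 00:23:29.808")   # kworker/u141:1 on CPU 0
cpu1 = parse_runq_stamp("1 12:10:42.840")   # swapper/1 on CPU 1

# How far the per-CPU value exceeds the jiffies-based uptime.
excess_cpu1 = cpu1 - uptime

print("CPU 0 exceeds uptime:", cpu0 > uptime)
print("CPU 1 exceeds uptime by:", excess_cpu1)
```

With the quoted values, the CPU 1 figure is roughly 18 minutes (1097.84 seconds)
larger than the reported uptime, while the CPU 0 figure is well below it. This
only quantifies the difference; it does not explain it beyond the two different
time sources already mentioned in the reply.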