Hi Anthony,

With respect to the nr_running and h_nr_running displays, since you can
"see" the number of tasks queued underneath each particular group, I'm
not convinced that it's worth displaying them?  In your first post you
mentioned:

> Since the way we crash the system by messing up the nr_running and h_nr_running,
> so we also display those two fields at the same time. Here’s an example of before and after.

Are you saying that you purposely modify those two values in order to
force a crash?

Anyway, I bring this up because their display is kind of ugly, and also
because in the output logs of my test of your patch, I see this
particular instance, where I've got a 3.6.0 kernel where a crash was
generated by entering "echo c > /proc/sysrq-trigger":

  crash> bt
  PID: 1212   TASK: ffff880035f60000  CPU: 1   COMMAND: "bash"
   #0 [ffff88007831fa20] machine_kexec at ffffffff8103e465
   #1 [ffff88007831fa90] crash_kexec at ffffffff810c6658
   #2 [ffff88007831fb60] oops_end at ffffffff815d5bf8
   #3 [ffff88007831fb90] no_context at ffffffff815c7dae
   #4 [ffff88007831fbf0] __bad_area_nosemaphore at ffffffff815c7f98
   #5 [ffff88007831fc40] bad_area at ffffffff815c81f0
   #6 [ffff88007831fc70] do_page_fault at ffffffff815d87d1
   #7 [ffff88007831fd80] page_fault at ffffffff815d5025
      [exception RIP: sysrq_handle_crash+22]
      RIP: ffffffff81388986  RSP: ffff88007831fe38  RFLAGS: 00010092
      RAX: 000000000000000f  RBX: ffffffff8192dc20  RCX: 00000000000014ff
      RDX: 000000000000332f  RSI: 0000000000000046  RDI: 0000000000000063
      RBP: ffff88007831fe38   R8: ffffffff81b26580   R9: 0000000000000397
      R10: 0000000000000002  R11: 0000000000000396  R12: 0000000000000063
      R13: 0000000000000286  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #8 [ffff88007831fe40] __handle_sysrq at ffffffff813890a7
   #9 [ffff88007831fe80] write_sysrq_trigger at ffffffff8138915a
  #10 [ffff88007831feb0] proc_reg_write at ffffffff811ea879
  #11 [ffff88007831ff00] vfs_write at ffffffff8118991c
  #12 [ffff88007831ff30] sys_write at ffffffff81189c4a
  #13 [ffff88007831ff80] system_call_fastpath at ffffffff815dcae9
      RIP: 00007f64d1a94530  RSP: 00007fffbb0c1248  RFLAGS: 00010246
      RAX: 0000000000000001  RBX: ffffffff815dcae9  RCX: 00000000fbad2a84
      RDX: 0000000000000002  RSI: 00007f64d23ab000  RDI: 0000000000000001
      RBP: 00007f64d23ab000   R8: 000000000000000a   R9: 00007f64d23a4740
      R10: 0000000000000001  R11: 0000000000000246  R12: 0000000000000002
      R13: 00007f64d1d61280  R14: 0000000000000002  R15: 00007f64d1d61280
      ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
  crash>

The "runq -g" output for that cpu looks like this:

  CPU 1
    CURRENT: PID: 1212   CFS: ffff880035cc2f00  TASK: ffff880035f60000  COMMAND: "bash"
    TASK_GROUP RT_RQ: ffff88007fa541e8
       RT PRIO_ARRAY: ffff88007fa541e8
          [no tasks queued]
    TASK_GROUP CFS_RQ: ffff88007fa540f0
       CFS RB_ROOT: ffff88007fa54118
          GROUP: ffff880078af7800  CFS_RQ: ffff880035cc2f00  RB_ROOT: ffff880035cc2f28  nr_running: 4294967297  h_nr_running: 201908650262921217
             [120] PID: 1212   TASK: ffff880035f60000  COMMAND: "bash"

I don't understand where those values are coming from, because if I look
at the CFS_RQ, it shows this:

  crash> cfs_rq.nr_running,h_nr_running ffff880035cc2f00
    nr_running = 1
    h_nr_running = 1
  crash>

I also see this occurring on live "snapshot" dumps -- which I understand
given that the kernel's runqueue data structures are being changed while
the dump is being created.  But I don't understand why it's happening in
the situation above.

Dave


----- Original Message -----
>
>
> ----- Original Message -----
> > Hi Dave,
> >
> > I have cleaned up the code and added another change.
>
> OK thanks -- the patch runs through my sample set of vmcores with no problem.
>
> > The current running task is not in the rb tree (rb_root), so runq
> > displays it like:
> >
> >   CURRENT: PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
> >   TASK_GROUP RT_RQ: ffff880002493820
> >      RT PRIO_ARRAY: ffff880002493820
> >         [no tasks queued]
> >   TASK_GROUP CFS_RQ: ffff8800024936e0
> >      CFS RB_ROOT: ffff880002493710
> >         GROUP CFS RB_ROOT: ffff882d609ce830 <TDAT>
> >            GROUP CFS RB_ROOT: ffff883f0bcbfa30 <User>
> >               [no tasks queued]
> >
> > I can understand why the current running task is not displayed.
> > However, the "-g" option displays all the task_groups the task
> > belongs to but at the end it shows "[no tasks queued]". That is
> > just strange. The new change is to display the task that is running like:
> >
> >   CURRENT: PID: 9048   CFS: ffff88039351a800  TASK: ffff8808b07e4200  COMMAND: "actmain"
> >   TASK_GROUP RT_RQ: ffff880002493820
> >      RT PRIO_ARRAY: ffff880002493820
> >         [no tasks queued]
> >   TASK_GROUP CFS_RQ: ffff8800024936e0
> >      CFS RB_ROOT: ffff880002493710
> >         GROUP: ffff884052bc9800  CFS_RQ: ffff882d609ce800  RB_ROOT: ffff882d609ce830 <TDAT>  nr_running: 1  h_nr_running: 1
> >            GROUP: ffff884058f1d000  CFS_RQ: ffff883f0bcbfa00  RB_ROOT: ffff883f0bcbfa30 <User>  nr_running: 1  h_nr_running: 1
> >               [120] PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
>
> OK -- I guess I understand why it probably makes sense to duplicate the
> CURRENT task underneath its own GROUP list -- but if that is done, then
> why clutter the CURRENT line with the CFS_RQ address?  And it's not clear
> to me why in your example above, the CFS address of ffff88039351a800
> doesn't show up as the CFS_RQ address above the "actmain" line?
>
> Taking a simple example, I see this:
>
>   crash> runq -g
>   CPU 0
>     CURRENT: PID: 0      CFS: ffff88000c7d6aa8  TASK: ffffffff8178ba60  COMMAND: "swapper"
>     TASK_GROUP RT_RQ: ffff88000c7d6b58
>        RT PRIO_ARRAY: ffff88000c7d6b58
>           [no tasks queued]
>     TASK_GROUP CFS_RQ: ffff88000c7d6aa8
>        CFS RB_ROOT: ffff88000c7d6ad0
>           [no tasks queued]
>
>   CPU 1
>     CURRENT: PID: 1268   CFS: ffff88000c9b5aa8  TASK: ffff88002f11c620  COMMAND: "bash"
>     TASK_GROUP RT_RQ: ffff88000c9b5b58
>        RT PRIO_ARRAY: ffff88000c9b5b58
>           [no tasks queued]
>     TASK_GROUP CFS_RQ: ffff88000c9b5aa8
>        CFS RB_ROOT: ffff88000c9b5ad0
>           [120] PID: 1268   TASK: ffff88002f11c620  COMMAND: "bash"
>
>   crash>
>
> Where the newly-interspersed CFS address redundantly shows the
> TASK_GROUP CFS_RQ below.  But adding the CFS address to the "swapper"
> line doesn't seem to make much sense, or help in any way, since the
> idle task is a special case that never gets queued.  And since the CFS
> address in the "bash" line is redundant with the TASK_GROUP CFS_RQ
> below, why bother showing it?
>
> And in a more complicated example, with your patch, the "qemu-kvm" task
> also shows up underneath its group:
>
>   CPU 0
>     CURRENT: PID: 3144   CFS: ffff88022aab2600  TASK: ffff88022a446040  COMMAND: "qemu-kvm"
>     TASK_GROUP RT_RQ: ffff880133c16148
>        RT PRIO_ARRAY: ffff880133c16148
>           [no tasks queued]
>     TASK_GROUP CFS_RQ: ffff880133c16028
>        CFS RB_ROOT: ffff880133c16058
>           GROUP: ffff88012b880800  CFS_RQ: ffff88022ac8f000  RB_ROOT: ffff88022ac8f030 <libvirt>  nr_running: 1  h_nr_running: 1
>              GROUP: ffff88012c078000  CFS_RQ: ffff88022c075000  RB_ROOT: ffff88022c075030 <qemu>  nr_running: 1  h_nr_running: 1
>                 GROUP: ffff88012b0fb400  CFS_RQ: ffff88022af94c00  RB_ROOT: ffff88022af94c30 <guest1>  nr_running: 1  h_nr_running: 1
>                    GROUP: ffff88022c6bbc00  CFS_RQ: ffff88022aab2600  RB_ROOT: ffff88022aab2630 <vcpu1>  nr_running: 1  h_nr_running: 1
>                       [120] PID: 3144   TASK: ffff88022a446040  COMMAND: "qemu-kvm"
>
> And note that its interspersed CFS address of ffff88022aab2600 is
> redundantly shown as the CFS_RQ of its GROUP down below.
>
> So I don't understand why your example shows different CFS addresses in
> the CURRENT line vs. the GROUP CFS_RQ address above the queued "actmain"
> task:
>
> >   CURRENT: PID: 9048   CFS: ffff88039351a800  TASK: ffff8808b07e4200  COMMAND: "actmain"
> >   TASK_GROUP RT_RQ: ffff880002493820
> >      RT PRIO_ARRAY: ffff880002493820
> >         [no tasks queued]
> >   TASK_GROUP CFS_RQ: ffff8800024936e0
> >      CFS RB_ROOT: ffff880002493710
> >         GROUP: ffff884052bc9800  CFS_RQ: ffff882d609ce800  RB_ROOT: ffff882d609ce830 <TDAT>  nr_running: 1  h_nr_running: 1
> >            GROUP: ffff884058f1d000  CFS_RQ: ffff883f0bcbfa00  RB_ROOT: ffff883f0bcbfa30 <User>  nr_running: 1  h_nr_running: 1
> >               [120] PID: 9048   TASK: ffff8808b07e4200  COMMAND: "actmain"
>
> Am I missing something?  Or is there a cut-and-paste error?
>
> Dave
>
>

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
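
A note on the odd nr_running and h_nr_running values in the "runq -g" output for CPU 1 in the top message: one possible reading, which is an assumption and is not confirmed anywhere in this thread, is that the two fields are 32 bits wide in that kernel's cfs_rq and that the display code reads each one as a 64-bit quantity on little-endian x86_64. In that case each printed number would hold the real counter in its low 32 bits and whatever bytes follow it in its high 32 bits. The Python sketch below only checks that arithmetic against the two numbers from the message. The helper name split_u64 is hypothetical, and nothing here inspects a real dump.

```python
import struct

# Values printed by the patched "runq -g" for CFS_RQ ffff880035cc2f00,
# copied from the message above.
displayed_nr_running = 4294967297
displayed_h_nr_running = 201908650262921217


def split_u64(value):
    """Split a 64-bit value into (low 32 bits, high 32 bits).

    Assumption: the value came from an 8-byte little-endian read, so the
    low half is the field at the read address and the high half is the
    4 bytes that follow it in memory.
    """
    low, high = struct.unpack("<II", struct.pack("<Q", value))
    return low, high


print(split_u64(displayed_nr_running))    # (1, 1)
print(split_u64(displayed_h_nr_running))  # (1, 47010521)
```

Under that assumption both low halves come out as 1, matching what "cfs_rq.nr_running,h_nr_running ffff880035cc2f00" reports, and the high half of the first value would be the adjacent h_nr_running. Checking the member sizes in the dump itself would be needed to confirm or rule this out.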