----- Original Message ----- > > > > > Hi, > > > > We’re debugging an in-house application that makes use of hard limit > extensively. We ran into a lot of timing windows (all our own making) and we > use runq -g command a lot. The current runq -g display only the RBROOT > pointer. It really is a bit inconvenient to traverse the task_group > hierarchy. It would be nice to have the command also display the > corresponding task_group, cfs_rq pointer (at a minimum). Since the way we > crash the system by messing up the nr_running and h_nr_running, so we also > display those two fields at the same time. Here’s an example of before and > after. > > > > CPU 4 > > CURRENT: PID: 0 TASK: ffff8840668c0380 COMMAND: "swapper" > > TASK_GROUP RT_RQ: ffff8800027d3820 > > RT PRIO_ARRAY: ffff8800027d3820 > > [no tasks queued] > > TASK_GROUP CFS_RQ: ffff8800027d36e0 > > CFS RB_ROOT: ffff8800027d3710 > > GROUP CFS RB_ROOT: ffff883ff69bcc30 <TDAT> > > GROUP CFS RB_ROOT: ffff884006290e30 <User> > > GROUP CFS RB_ROOT: ffff88400641c430 <TDWMVP1> > > GROUP CFS RB_ROOT: ffff88400646b030 <ServDown1:0> > > GROUP CFS RB_ROOT: ffff884006492630 <ServOrder1:1> > > GROUP CFS RB_ROOT: ffff883ff058fe30 <TDWMWD57> (THROTTLED) > > GROUP CFS RB_ROOT: ffff88047889ee30 <S:WD:3d:35fa> > > [120] PID: 27655 TASK: ffff8805078ce2c0 COMMAND: "actmain" > > <<< more throttled groups removed >>> > > > > CPU 4 > > CURRENT: PID: 0 CFS: ffff8800027d36e0 TASK: ffff8840668c0380 COMMAND: > "swapper" > > TASK_GROUP RT_RQ: ffff8800027d3820 > > RT PRIO_ARRAY: ffff8800027d3820 > > [no tasks queued] > > TASK_GROUP CFS_RQ: ffff8800027d36e0 > > CFS RB_ROOT: ffff8800027d3710 > > GROUP: ffff88405394d000 CFS: ffff883ff69bcc00 RB_ROOT: ffff883ff69bcc30 > <TDAT> (0 0) > > GROUP: ffff88405906c400 CFS: ffff884006290e00 RB_ROOT: ffff884006290e30 > <User> (0 0) > > GROUP: ffff884055081000 CFS: ffff88400641c400 RB_ROOT: ffff88400641c430 > <TDWMVP1> (0 0) > > GROUP: ffff884055081c00 CFS: ffff88400646b000 RB_ROOT: ffff88400646b030 > <ServDown1:0> (0 0) > > GROUP: ffff8840580fd400 CFS: ffff884006492600 RB_ROOT: ffff884006492630 > <ServOrder1:1> (0 0) > > GROUP: ffff884058f58c00 CFS: ffff883ff058fe00 RB_ROOT: ffff883ff058fe30 > <TDWMWD57> (7 9) (THROTTLED) > > GROUP: ffff8808fb976000 CFS: ffff88047889ee00 RB_ROOT: ffff88047889ee30 > <S:WD:3d:35fa> (1 1) > > [120] PID: 27655 TASK: ffff8805078ce2c0 COMMAND: "actmain" > > <<< more throttled groups removed >>> > > > > I have attached the patch we use to display additional information. Could you > please take a look at my proposal to see if it is possible that you include > this kind of display format. > > > > Thanks, > > Anthony A couple quick observations... Adding the CFS: address makes sense, but besides you and me and whoever reads the patch/code, how would anybody know what the two numbers inside the parentheses mean? It could be documented in the help page, but I wonder if it could be made more obvious somehow? Anyway, I ran a test on a sample set of vmcores, and this addition will only work with relevant kernel versions. For example, in these four dumpfiles (and therefore their kernel versions), the command fails because the cfs_rq.h_nr_running member does not exist: 2.6.38.2-9.fc15: CPU 0 CURRENT: PID: 1180 CFS: ffff880037ef1b00 TASK: ffff88003bea2e40 COMMAND: "crash" TASK_GROUP RT_RQ: ffff88003fc13988 RT PRIO_ARRAY: ffff88003fc13988 [no tasks queued] TASK_GROUP CFS_RQ: ffff88003fc138b0 CFS RB_ROOT: ffff88003fc138d8 GROUP: ffff88003c610a00 CFS: ffff880037ef1b00 RB_ROOT: ffff880037ef1b28 runq: invalid structure member offset: cfs_rq_h_nr_running FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair() 2.6.40.4-5.fc15: CPU 1 CURRENT: PID: 1341 CFS: ffff880037592f00 TASK: ffff880037409730 COMMAND: "crash" TASK_GROUP RT_RQ: ffff88003fc92690 RT PRIO_ARRAY: ffff88003fc92690 [no tasks queued] TASK_GROUP CFS_RQ: ffff88003fc925b0 CFS RB_ROOT: ffff88003fc925d8 GROUP: ffff88003a84b800 CFS: ffff880037592f00 RB_ROOT: ffff880037592f28 runq: invalid structure member offset: cfs_rq_h_nr_running FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair() 2.6.32-131.0.15.el6: CPU 0 CURRENT: PID: 28263 CFS: ffff8800794b7140 TASK: ffff880037aaa040 COMMAND: "loop.ABA" TASK_GROUP RT_RQ: ffff88000a216098 RT PRIO_ARRAY: ffff88000a216098 [no tasks queued] TASK_GROUP CFS_RQ: ffff88000a215fe8 CFS RB_ROOT: ffff88000a216010 GROUP: ffff8800785bc200 CFS: ffff8800784d7c80 RB_ROOT: ffff8800784d7ca8 <A> runq: invalid structure member offset: cfs_rq_h_nr_running FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair() 3.1.7-1.fc16: CPU 2 CURRENT: PID: 1495 CFS: ffff8800277f8500 TASK: ffff880037a60000 COMMAND: "crash" TASK_GROUP RT_RQ: ffff88003e2532d0 RT PRIO_ARRAY: ffff88003e2532d0 [no tasks queued] TASK_GROUP CFS_RQ: ffff88003e2531f0 CFS RB_ROOT: ffff88003e253218 GROUP: ffff88002722b000 CFS: ffff8800277f8500 RB_ROOT: ffff8800277f8528 runq: invalid structure member offset: cfs_rq_h_nr_running FILE: task.c LINE: 7632 FUNCTION: print_group_header_fair() So you'll need to check VALID_MEMBER(cfs_rq_h_nr_running) before attempting to print it, and maybe just show the cfs_rq.nr_running value. When adding new items to the offset_table, they should be added to the end of the structure so that extension modules will continue to be able to access any offsets as expected without having to be re-compiled. But you can display the new item in dump_offset_table() wherever you'd like, typically grouped with similar/associated items -- although in your patch you did this: + fprintf(fp, " cfs_rq_nr_running: %ld\n", + OFFSET(cfs_rq_nr_running)); + fprintf(fp, " cfs_rq_h_nr_running: %ld\n", + OFFSET(cfs_rq_h_nr_running)); The previously-existing cfs_rq_nr_running display already exists, so I suggest that you just move your new cfs_rq_h_nr_running display underneath it. (and please line up the "help -o" output of the new item as well). Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility