On 11/06/2012 05:44 AM, Dave Anderson wrote:
>
>
> ----- Original Message -----
>>>>
>>>> I worry about the large number of kernel structure.member declarations that the
>>>> command depends upon, because if just one of them changes, it breaks the command.
>>>> So it would be preferable if the "runq" command (with no arguments) would only use
>>>> a minimal set of structure.member offsets.
>>
>> I made two new offset-init functions for runq:
>>
>> static void cfs_rq_offset_init(void);
>> static void task_group_offset_init(void);
>
> One minor suggestion -- it's really not necessary to call MEMBER_EXISTS()
> prior to calling MEMBER_OFFSET_INIT():
>
> diff --git a/kernel.c b/kernel.c
> index 45da48e..868d777 100755
> --- a/kernel.c
> +++ b/kernel.c
> @@ -308,6 +308,12 @@ kernel_init()
>          STRUCT_SIZE_INIT(prio_array, "prio_array");
>
>          MEMBER_OFFSET_INIT(rq_cfs, "rq", "cfs");
> +       if (MEMBER_EXISTS("task_group", "cfs_rq"))
> +               MEMBER_OFFSET_INIT(task_group_cfs_rq, "task_group", "cfs_rq");
> +       if (MEMBER_EXISTS("task_group", "rt_rq"))
> +               MEMBER_OFFSET_INIT(task_group_rt_rq, "task_group", "rt_rq");
> +       if (MEMBER_EXISTS("task_group", "parent"))
> +               MEMBER_OFFSET_INIT(task_group_parent, "task_group", "parent");
>
> You can simply just call MEMBER_OFFSET_INIT() the 3 times above.  If the structure
> members don't exist, MEMBER_OFFSET_INIT() will just set the offsets to -1
> (INVALID_OFFSET).
>
> The same thing applies in your task_group_offset_init() function:
>
> +       if (MEMBER_EXISTS("task_group", "cfs_bandwidth")) {
> +               MEMBER_OFFSET_INIT(task_group_cfs_bandwidth,
> +                       "task_group", "cfs_bandwidth");
> +               MEMBER_OFFSET_INIT(cfs_rq_throttled, "cfs_rq",
> +                       "throttled");
> +       }
> +
> +       if (MEMBER_EXISTS("task_group", "rt_bandwidth")) {
> +               MEMBER_OFFSET_INIT(task_group_rt_bandwidth,
> +                       "task_group", "rt_bandwidth");
> +               MEMBER_OFFSET_INIT(rt_rq_rt_throttled, "rt_rq",
> +                       "rt_throttled");
> +       }
>
> Sometimes it helps to call MEMBER_EXISTS() for clarity's sake, but it's
> actually kind of redundant.

Thanks. Fixed.

>
>>>
>>> And to follow up, I'm still running tests (and will do so overnight) on your latest
>>> patch, but I immediately see this on any 2.6.30, 2.6.31 or 2.6.32 kernel, and on
>>> some 2.6.36 and 2.6.38 kernels, where "runq -g" fails like this:
>>>
>>> crash> runq -g
>>> runq: invalid kernel virtual address: 0  type: "dentry"
>>> crash>
>>>
>>
>> As you have pointed out in another mail, this is fixed in patch v2, I think.
>
> Yes, that's fixed this time...
>
>>
>>> And we certainly want to keep the group information separate from
>>> the normal "runq" command.  Here on a "live" 4 cpu Fedora 3.6.3 kernel,
>>> the command output exceeds 1000 lines!  I'm pretty sure that most
>>> people will *not* want to see all of this:
>>>
>>> crash> runq -g
>>> CPU 0
>>>   CURRENT: PID: 0      TASK: ffffffff81c13420  COMMAND: "swapper/0"
>>>   RT PRIO_ARRAY: ffff88021e213e28
>>>      [100] GROUP RT PRIO_ARRAY: ffff88020db22000 <system>
>>>         [100] GROUP RT PRIO_ARRAY: ffff8801f8180000 <udisks2.service>
>>>            [no tasks queued]
>>
>> <snip>
>>
>>>            [no tasks queued]
>>>      GROUP CFS RB_ROOT: ffff88020a844128
>>>         [no tasks queued]
>>> crash>
>>>
>>> In fact, it's difficult to actually find a real *task* that is on
>>> a run queue from among all of the empty "[no tasks queued]" groups!
>>>
>>
>> I've noticed this in the first patch and patch v2 has fixed this.
>
> OK, although I didn't realize that was a bug?  What were all of those
> empty groups that are no longer shown?  Are they actually *not* queued?

Yes, the cfs_rq and rt_rq owned by empty groups are actually not queued.
Every group has its own cfs_rq and rt_rq; if there is no task in a group
and no task in any of its child groups, it is an empty group, and its
cfs_rq and rt_rq will not be queued in the parent runqueue. A throttled
cfs_rq (or rt_rq) is also dequeued from its parent cfs_rq (rt_rq), but we
should still display it because it contains queued tasks. For example:

CPU 0
  CURRENT: PID: 14734  TASK: ffff88010626f500  COMMAND: "sh"
  RT PRIO_ARRAY: ffff880028216808
     [  0] GROUP RT PRIO_ARRAY: ffff880139fc9800 <test1> (THROTTLED)
        [  0] PID: 14750  TASK: ffff88013a4dd540  COMMAND: "rtloop99"
        [  1] PID: 14748  TASK: ffff88013bbca040  COMMAND: "rtloop98"
        [  1] GROUP RT PRIO_ARRAY: ffff880089029000 <test11>
           [  1] PID: 14752  TASK: ffff880088abf500  COMMAND: "rtloop98"
        [ 54] PID: 14749  TASK: ffff880037a4e080  COMMAND: "rtloop45"
        [ 98] PID: 14746  TASK: ffff88012678c080  COMMAND: "rtloop1"
  CFS RB_ROOT: ffff88013fc23050
     [120] PID: 14740  TASK: ffff88013b1e6080  COMMAND: "sh"
     [120] PID: 14738  TASK: ffff88012678d540  COMMAND: "sh"
     GROUP CFS RB_ROOT: ffff8800897af430 <test2> (THROTTLED)
        [120] PID: 14732  TASK: ffff88013bbcb500  COMMAND: "sh"
        [120] PID: 14728  TASK: ffff8800b3496080  COMMAND: "sh"
        [120] PID: 14730  TASK: ffff880037833540  COMMAND: "sh"
     GROUP CFS RB_ROOT: ffff880037943e30 <test1> (THROTTLED)
        [120] PID: 14726  TASK: ffff880138d42aa0  COMMAND: "sh"

>
>>
>> So I attach the new patch v2 version for runq -g.  If you still find
>> any bug in your tests or have any suggestion about it, that would be very
>> helpful.
>>
>> TODO:
>> 1. The help info about the -g option.
>> 2. Change rt_rq tasks to be displayed non-hierarchically.
>
> Like I mentioned above, the latest patch does not change the default
> behavior of runq alone, and "runq -g" is not as verbose as the last
> patch, which I presume is your intent.

Two patches are attached: one is the fixed version of v2, and the other
adds the help info for "runq -g".

Thanks
Zhang
>From cb4ad37af801b9b7044baf03dd54e251433de33f Mon Sep 17 00:00:00 2001 From: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx> Date: Tue, 6 Nov 2012 16:24:01 +0800 Subject: [PATCH 1/2] add -g option for runq v3 Signed-off-by: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx> --- defs.h | 17 ++ kernel.c | 3 + symbols.c | 34 ++++ task.c | 612 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------ 4 files changed, 612 insertions(+), 54 deletions(-) diff --git a/defs.h b/defs.h index 319584f..798dd9b 100755 --- a/defs.h +++ b/defs.h @@ -1792,6 +1792,22 @@ struct offset_table { /* stash of commonly-used offsets */ long sched_rt_entity_my_q; long neigh_table_hash_shift; long neigh_table_nht_ptr; + long task_group_parent; + long task_group_css; + long cgroup_subsys_state_cgroup; + long cgroup_dentry; + long task_group_rt_rq; + long rt_rq_tg; + long task_group_cfs_rq; + long cfs_rq_tg; + long task_group_siblings; + long task_group_children; + long task_group_cfs_bandwidth; + long cfs_rq_throttled; + long task_group_rt_bandwidth; + long rt_rq_rt_throttled; + long rt_rq_highest_prio; + long rt_rq_rt_nr_running; }; struct size_table { /* stash of commonly-used sizes */ @@ -1927,6 +1943,7 @@ struct size_table { /* stash of commonly-used sizes */ long log; long log_level; long rt_rq; + long task_group; }; struct array_table { diff --git a/kernel.c b/kernel.c index 45da48e..76441e9 100755 --- a/kernel.c +++ b/kernel.c @@ -308,6 +308,9 @@ kernel_init() STRUCT_SIZE_INIT(prio_array, "prio_array"); MEMBER_OFFSET_INIT(rq_cfs, "rq", "cfs"); + MEMBER_OFFSET_INIT(task_group_cfs_rq, "task_group", "cfs_rq"); + MEMBER_OFFSET_INIT(task_group_rt_rq, "task_group", "rt_rq"); + MEMBER_OFFSET_INIT(task_group_parent, "task_group", "parent"); /* * In 2.4, smp_send_stop() sets smp_num_cpus back to 1 diff --git a/symbols.c b/symbols.c index 1f09c9f..3179edc 100755 --- a/symbols.c +++ b/symbols.c @@ -8820,6 +8820,38 @@ dump_offset_table(char *spec, ulong makestruct) OFFSET(log_flags_level)); fprintf(fp, " 
sched_rt_entity_my_q: %ld\n", OFFSET(sched_rt_entity_my_q)); + fprintf(fp, " task_group_parent: %ld\n", + OFFSET(task_group_parent)); + fprintf(fp, " task_group_css: %ld\n", + OFFSET(task_group_css)); + fprintf(fp, " cgroup_subsys_state_cgroup: %ld\n", + OFFSET(cgroup_subsys_state_cgroup)); + fprintf(fp, " cgroup_dentry: %ld\n", + OFFSET(cgroup_dentry)); + fprintf(fp, " task_group_rt_rq: %ld\n", + OFFSET(task_group_rt_rq)); + fprintf(fp, " rt_rq_tg: %ld\n", + OFFSET(rt_rq_tg)); + fprintf(fp, " task_group_cfs_rq: %ld\n", + OFFSET(task_group_cfs_rq)); + fprintf(fp, " cfs_rq_tg: %ld\n", + OFFSET(cfs_rq_tg)); + fprintf(fp, " task_group_siblings: %ld\n", + OFFSET(task_group_siblings)); + fprintf(fp, " task_group_children: %ld\n", + OFFSET(task_group_children)); + fprintf(fp, " task_group_cfs_bandwidth: %ld\n", + OFFSET(task_group_cfs_bandwidth)); + fprintf(fp, " cfs_rq_throttled: %ld\n", + OFFSET(cfs_rq_throttled)); + fprintf(fp, " task_group_rt_bandwidth: %ld\n", + OFFSET(task_group_rt_bandwidth)); + fprintf(fp, " rt_rq_rt_throttled: %ld\n", + OFFSET(rt_rq_rt_throttled)); + fprintf(fp, " rt_rq_highest_prio: %ld\n", + OFFSET(rt_rq_highest_prio)); + fprintf(fp, " rt_rq_rt_nr_running: %ld\n", + OFFSET(rt_rq_rt_nr_running)); fprintf(fp, "\n size_table:\n"); fprintf(fp, " page: %ld\n", SIZE(page)); @@ -9037,6 +9069,8 @@ dump_offset_table(char *spec, ulong makestruct) SIZE(log_level)); fprintf(fp, " rt_rq: %ld\n", SIZE(rt_rq)); + fprintf(fp, " task_group: %ld\n", + SIZE(task_group)); fprintf(fp, "\n array_table:\n"); /* diff --git a/task.c b/task.c index f8c6325..967e90b 100755 --- a/task.c +++ b/task.c @@ -64,10 +64,23 @@ static struct rb_node *rb_parent(struct rb_node *, struct rb_node *); static struct rb_node *rb_right(struct rb_node *, struct rb_node *); static struct rb_node *rb_left(struct rb_node *, struct rb_node *); static void dump_task_runq_entry(struct task_context *); -static int dump_tasks_in_cfs_rq(ulong); +static void print_group_header_fair(int, ulong, void 
*); +static int dump_tasks_in_lower_dequeued_cfs_rq(int, ulong, int); +static int dump_tasks_in_cfs_rq(int, int, ulong, int); static void dump_on_rq_tasks(void); +static void cfs_rq_offset_init(void); +static void task_group_offset_init(void); static void dump_CFS_runqueues(void); -static void dump_RT_prio_array(int, ulong, char *); +static void print_group_header_rt(ulong, void *); +static int dump_tasks_in_lower_dequeued_rt_rq(int, ulong, int); +static void dump_RT_prio_array(int, int, ulong, char *, int); +static void get_task_group_name(ulong, char **); +static void sort_task_group_info_array(void *, int); +static void print_task_group_info_array(void *, int); +static void reuse_task_group_info_array(void *, int); +static void free_task_group_info_array(void *, int *); +static void fill_task_group_info_array(int, ulong, char *); +static void dump_tasks_by_task_group(void); static void task_struct_member(struct task_context *,unsigned int, struct reference *); static void signal_reference(struct task_context *, ulong, struct reference *); static void do_sig_thread_group(ulong); @@ -7028,8 +7041,9 @@ cmd_runq(void) int c; int sched_debug = 0; int dump_timestamp_flag = 0; + int dump_task_group_flag = 0; - while ((c = getopt(argcnt, args, "dt")) != EOF) { + while ((c = getopt(argcnt, args, "dtg")) != EOF) { switch(c) { case 'd': @@ -7038,6 +7052,13 @@ cmd_runq(void) case 't': dump_timestamp_flag = 1; break; + case 'g': + if (INVALID_MEMBER(task_group_cfs_rq) || + INVALID_MEMBER(task_group_rt_rq) || + INVALID_MEMBER(task_group_parent)) + option_not_supported(c); + dump_task_group_flag = 1; + break; default: argerrs++; break; @@ -7053,12 +7074,16 @@ cmd_runq(void) return; } - if (sched_debug) { dump_on_rq_tasks(); return; } + if (dump_task_group_flag) { + dump_tasks_by_task_group(); + return; + } + dump_runq(); } @@ -7421,6 +7446,85 @@ rb_next(struct rb_node *node) return parent; } +#define MAX_GROUP_NUM 200 +struct task_group_info { + int use; + int depth; + char 
*name; + ulong task_group; +}; + +static struct task_group_info *tgi_array; +static struct task_group_info *tgi_queue[MAX_GROUP_NUM / 10 ]; +static int tgi_p = 0; +static int tgi_q = 0; + +#define COPY_GROUP(t1, t2) \ +do { \ + t1.use = t2.use; \ + t1.task_group = t2.task_group; \ + t1.depth = t2.depth; \ + t1.name = t2.name; \ +} while (0); + +static void +sort_task_group_info_array(void *a, int len) +{ + int i, j; + struct task_group_info tmp; + struct task_group_info *array = (struct task_group_info *)a; + + for (i = 0; i < len - 1; i++) { + for (j = 0; j < len - i - 1; j++) { + if (array[j].depth > array[j+1].depth) { + COPY_GROUP(tmp, array[j+1]); + COPY_GROUP(array[j+1], array[j]); + COPY_GROUP(array[j], tmp); + } + } + } +} + +static void +print_task_group_info_array(void *a, int len) +{ + int i; + struct task_group_info *array = (struct task_group_info *)a; + + for (i = 0; i < len; i++) { + fprintf(fp, "%d : use=%d, depth=%d, group=%lx, ", i, + array[i].use, array[i].depth, array[i].task_group); + fprintf(fp, "name=%s\n", array[i].name ? 
array[i].name : "NULL"); + } +} + +static void +free_task_group_info_array(void *a, int *len) +{ + int i; + struct task_group_info *array = (struct task_group_info *)a; + + for (i = 0; i < *len; i++) { + if (array[i].name) + FREEBUF(array[i].name); + } + *len = 0; + FREEBUF(array); +} + +static void +reuse_task_group_info_array(void *a, int len) +{ + int i; + struct task_group_info *array = (struct task_group_info *)a; + + for (i = 0; i < len; i++) { + if (array[i].depth == 0) + array[i].use = 0; + array[i].use = 1; + } +} + static void dump_task_runq_entry(struct task_context *tc) { @@ -7428,22 +7532,120 @@ dump_task_runq_entry(struct task_context *tc) readmem(tc->task + OFFSET(task_struct_prio), KVADDR, &prio, sizeof(int), "task prio", FAULT_ON_ERROR); - fprintf(fp, " [%3d] ", prio); + fprintf(fp, "[%3d] ", prio); fprintf(fp, "PID: %-5ld TASK: %lx COMMAND: \"%s\"\n", tc->pid, tc->task, tc->comm); } +static void +print_group_header_fair(int depth, ulong cfs_rq, void *t) +{ + int throttled; + struct rb_root *root; + struct task_group_info *tgi = (struct task_group_info *)t; + + root = (struct rb_root *)(cfs_rq + OFFSET(cfs_rq_tasks_timeline)); + INDENT(2 + 3 * depth); + fprintf(fp, "GROUP CFS RB_ROOT: %lx", (ulong)root); + if (tgi->name) + fprintf(fp, " <%s>", tgi->name); + + if (VALID_MEMBER(task_group_cfs_bandwidth)) { + readmem(cfs_rq + OFFSET(cfs_rq_throttled), KVADDR, + &throttled, sizeof(int), "cfs_rq throttled", + FAULT_ON_ERROR); + if (throttled) + fprintf(fp, " (THROTTLED)"); + } + fprintf(fp, "\n"); +} + static int -dump_tasks_in_cfs_rq(ulong cfs_rq) +dump_tasks_in_lower_dequeued_cfs_rq(int depth, ulong cfs_rq, int cpu) +{ + int i, j, total, nr_running; + ulong t, p, cfs_rq_c, cfs_rq_p, tmp1, tmp2; + + total = 0; + for (i = 0; i < tgi_p; i++) { + if (tgi_array[i].use == 0 || tgi_array[i].depth - depth != 1) + continue; + + readmem(cfs_rq + OFFSET(cfs_rq_tg), KVADDR, &t, sizeof(ulong), + "cfs_rq tg", + FAULT_ON_ERROR); + readmem(tgi_array[i].task_group + 
OFFSET(task_group_parent), + KVADDR, &p, sizeof(ulong), "task_group parent", + FAULT_ON_ERROR); + if (t != p) + continue; + + readmem(tgi_array[i].task_group + OFFSET(task_group_cfs_rq), + KVADDR, &cfs_rq_c, sizeof(ulong), "task_group cfs_rq", + FAULT_ON_ERROR); + readmem(cfs_rq_c + cpu * sizeof(ulong), KVADDR, &cfs_rq_p, + sizeof(ulong), "task_group cfs_rq", FAULT_ON_ERROR); + if (cfs_rq == cfs_rq_p) + continue; + + tgi_array[i].use = 0; + + readmem(cfs_rq_p + OFFSET(cfs_rq_nr_running), KVADDR, + &nr_running, sizeof(int), "cfs_rq nr_running", + FAULT_ON_ERROR); + if (nr_running == 0) { + tgi_queue[tgi_q++] = &tgi_array[i]; + total += dump_tasks_in_lower_dequeued_cfs_rq(depth + 1, + cfs_rq_p, cpu); + continue; + } + + for (j = 0; j < tgi_q; j++) { + readmem(tgi_queue[j]->task_group + OFFSET(task_group_cfs_rq), + KVADDR, &tmp1, sizeof(ulong), "task_group cfs_rq", + FAULT_ON_ERROR); + readmem(tmp1 + cpu * sizeof(ulong), KVADDR, &tmp2, + sizeof(ulong), "task_group cfs_rq", FAULT_ON_ERROR); + + print_group_header_fair(tgi_queue[j]->depth, + tmp2, tgi_queue[j]); + } + tgi_q = 0; + + total++; + total += dump_tasks_in_cfs_rq(1, depth + 1, cfs_rq_p, cpu); + } + if (tgi_q > 0) + tgi_q--; + + return total; +} + +static int +dump_tasks_in_cfs_rq(int g_flag, int depth, ulong cfs_rq, int cpu) { struct task_context *tc; struct rb_root *root; struct rb_node *node; - ulong my_q, leftmost, curr, curr_my_q; - int total; + ulong my_q, leftmost, curr, curr_my_q, tg; + int total, i; total = 0; + if (g_flag && depth) { + readmem(cfs_rq + OFFSET(cfs_rq_tg), KVADDR, + &tg, sizeof(ulong), "cfs_rq tg", + FAULT_ON_ERROR); + for (i = 0; i < tgi_p; i++) { + if (tgi_array[i].task_group == tg) + break; + } + if (i < tgi_p) { + tgi_array[i].use = 0; + print_group_header_fair(depth, cfs_rq, &tgi_array[i]); + } + } + if (VALID_MEMBER(sched_entity_my_q)) { readmem(cfs_rq + OFFSET(cfs_rq_curr), KVADDR, &curr, sizeof(ulong), "curr", FAULT_ON_ERROR); @@ -7451,8 +7653,11 @@ dump_tasks_in_cfs_rq(ulong 
cfs_rq) readmem(curr + OFFSET(sched_entity_my_q), KVADDR, &curr_my_q, sizeof(ulong), "curr->my_q", FAULT_ON_ERROR); - if (curr_my_q) - total += dump_tasks_in_cfs_rq(curr_my_q); + if (curr_my_q) { + total += g_flag; + total += dump_tasks_in_cfs_rq(g_flag, depth + 1, + curr_my_q, cpu); + } } } @@ -7466,7 +7671,9 @@ dump_tasks_in_cfs_rq(ulong cfs_rq) + OFFSET(sched_entity_my_q), KVADDR, &my_q, sizeof(ulong), "my_q", FAULT_ON_ERROR); if (my_q) { - total += dump_tasks_in_cfs_rq(my_q); + total += g_flag; + total += dump_tasks_in_cfs_rq(g_flag, depth + 1, + my_q, cpu); continue; } } @@ -7475,9 +7682,11 @@ dump_tasks_in_cfs_rq(ulong cfs_rq) OFFSET(sched_entity_run_node)); if (!tc) continue; - if (hq_enter((ulong)tc)) + if (hq_enter((ulong)tc)) { + INDENT(5); + INDENT(g_flag ? 3 * depth : 0); dump_task_runq_entry(tc); - else { + } else { error(WARNING, "duplicate CFS runqueue node: task %lx\n", tc->task); return total; @@ -7485,6 +7694,16 @@ dump_tasks_in_cfs_rq(ulong cfs_rq) total++; } + if (g_flag) + total += dump_tasks_in_lower_dequeued_cfs_rq(depth, cfs_rq, cpu); + + if (!total) { + if ((!g_flag && !depth) || g_flag) { + INDENT(5); + INDENT(g_flag ? 
3 * depth : 0); + fprintf(fp, "[no tasks queued]\n"); + } + } return total; } @@ -7531,6 +7750,7 @@ dump_on_rq_tasks(void) if (!on_rq || tc->processor != cpu) continue; + INDENT(5); dump_task_runq_entry(tc); tot++; } @@ -7543,16 +7763,8 @@ dump_on_rq_tasks(void) } static void -dump_CFS_runqueues(void) +cfs_rq_offset_init(void) { - int tot, cpu; - ulong runq, cfs_rq; - char *runqbuf, *cfs_rq_buf; - ulong tasks_timeline ATTRIBUTE_UNUSED; - struct task_context *tc; - struct rb_root *root; - struct syment *rq_sp, *init_sp; - if (!VALID_STRUCT(cfs_rq)) { STRUCT_SIZE_INIT(cfs_rq, "cfs_rq"); STRUCT_SIZE_INIT(rt_rq, "rt_rq"); @@ -7585,6 +7797,49 @@ dump_CFS_runqueues(void) "run_list"); MEMBER_OFFSET_INIT(rt_prio_array_queue, "rt_prio_array", "queue"); } +} + +static void +task_group_offset_init(void) +{ + if (!VALID_STRUCT(task_group)) { + STRUCT_SIZE_INIT(task_group, "task_group"); + MEMBER_OFFSET_INIT(rt_rq_rt_nr_running, "rt_rq", "rt_nr_running"); + MEMBER_OFFSET_INIT(cfs_rq_tg, "cfs_rq", "tg"); + MEMBER_OFFSET_INIT(rt_rq_tg, "rt_rq", "tg"); + MEMBER_OFFSET_INIT(rt_rq_highest_prio, "rt_rq", "highest_prio"); + MEMBER_OFFSET_INIT(task_group_css, "task_group", "css"); + MEMBER_OFFSET_INIT(cgroup_subsys_state_cgroup, + "cgroup_subsys_state", "cgroup"); + MEMBER_OFFSET_INIT(cgroup_dentry, "cgroup", "dentry"); + + MEMBER_OFFSET_INIT(task_group_siblings, "task_group", "siblings"); + MEMBER_OFFSET_INIT(task_group_children, "task_group", "children"); + + MEMBER_OFFSET_INIT(task_group_cfs_bandwidth, + "task_group", "cfs_bandwidth"); + MEMBER_OFFSET_INIT(cfs_rq_throttled, "cfs_rq", + "throttled"); + + MEMBER_OFFSET_INIT(task_group_rt_bandwidth, + "task_group", "rt_bandwidth"); + MEMBER_OFFSET_INIT(rt_rq_rt_throttled, "rt_rq", + "rt_throttled"); + } +} + +static void +dump_CFS_runqueues(void) +{ + int cpu; + ulong runq, cfs_rq; + char *runqbuf, *cfs_rq_buf; + ulong tasks_timeline ATTRIBUTE_UNUSED; + struct task_context *tc; + struct rb_root *root; + struct syment *rq_sp, *init_sp; 
+ + cfs_rq_offset_init(); if (!(rq_sp = per_cpu_symbol_search("per_cpu__runqueues"))) error(FATAL, "per-cpu runqueues do not exist\n"); @@ -7635,18 +7890,14 @@ dump_CFS_runqueues(void) OFFSET(cfs_rq_tasks_timeline)); } - dump_RT_prio_array(0, runq + OFFSET(rq_rt) + OFFSET(rt_rq_active), - &runqbuf[OFFSET(rq_rt) + OFFSET(rt_rq_active)]); + dump_RT_prio_array(0, 0, runq + OFFSET(rq_rt) + OFFSET(rt_rq_active), + &runqbuf[OFFSET(rq_rt) + OFFSET(rt_rq_active)], cpu); fprintf(fp, " CFS RB_ROOT: %lx\n", (ulong)root); hq_open(); - tot = dump_tasks_in_cfs_rq(cfs_rq); + dump_tasks_in_cfs_rq(0, 0, cfs_rq, cpu); hq_close(); - if (!tot) { - INDENT(5); - fprintf(fp, "[no tasks queued]\n"); - } } FREEBUF(runqbuf); @@ -7655,7 +7906,106 @@ dump_CFS_runqueues(void) } static void -dump_RT_prio_array(int depth, ulong k_prio_array, char *u_prio_array) +print_group_header_rt(ulong rt_rq, void *t) +{ + int throttled; + + struct task_group_info *tgi = (struct task_group_info *)t; + + if (tgi->name) + fprintf(fp, " <%s>", tgi->name); + + if (VALID_MEMBER(task_group_rt_bandwidth)) { + readmem(rt_rq + OFFSET(rt_rq_rt_throttled), KVADDR, + &throttled, sizeof(int), "rt_rq rt_throttled", + FAULT_ON_ERROR); + if (throttled) + fprintf(fp, " (THROTTLED)"); + } +} + +static int +dump_tasks_in_lower_dequeued_rt_rq(int depth, ulong rt_rq, int cpu) +{ + int i, j, prio, tot, delta, nr_running; + char *rt_rq_buf; + ulong rt_rq_c, rt_rq_p, t, p, tmp1, tmp2; + + tot = 0; + for (i = 0; i < tgi_p; i++) { + delta = tgi_array[i].depth - depth; + if (delta > 1) + break; + + if (tgi_array[i].use == 0 || delta < 1) + continue; + + readmem(rt_rq + OFFSET(rt_rq_tg), KVADDR, &t, sizeof(ulong), + "rt_rq tg", FAULT_ON_ERROR); + readmem(tgi_array[i].task_group + OFFSET(task_group_parent), + KVADDR, &p, sizeof(ulong), "task_group parent", + FAULT_ON_ERROR); + if (t != p) + continue; + + readmem(tgi_array[i].task_group + OFFSET(task_group_rt_rq), + KVADDR, &rt_rq_c, sizeof(ulong), "task_group rt_rq", + FAULT_ON_ERROR); 
+ readmem(rt_rq_c + cpu * sizeof(ulong), KVADDR, &rt_rq_p, + sizeof(ulong), "task_group rt_rq", FAULT_ON_ERROR); + if (rt_rq == rt_rq_p) + continue; + + tgi_array[i].use = 0; + + readmem(rt_rq_p + OFFSET(rt_rq_rt_nr_running), KVADDR, + &nr_running, sizeof(int), "rt_rq rt_nr_running", + FAULT_ON_ERROR); + if (nr_running == 0) { + tgi_queue[tgi_q++] = &tgi_array[i]; + tot += dump_tasks_in_lower_dequeued_rt_rq(depth + 1, + rt_rq_p, cpu); + continue; + } + + for (j = 0; j < tgi_q; j++) { + readmem(tgi_queue[j]->task_group + OFFSET(task_group_rt_rq), + KVADDR, &tmp1, sizeof(ulong), "task_group rt_rq", + FAULT_ON_ERROR); + readmem(tmp1 + cpu * sizeof(ulong), KVADDR, &tmp2, + sizeof(ulong), "task_group rt_rq", FAULT_ON_ERROR); + readmem(tmp2 + OFFSET(rt_rq_highest_prio), KVADDR, &prio, + sizeof(int), "rt_rq highest prio", FAULT_ON_ERROR); + + INDENT(-1 + 6 * tgi_queue[j]->depth); + fprintf(fp, "[%3d] ", prio); + fprintf(fp, "GROUP RT PRIO_ARRAY: %lx", + tmp2 + OFFSET(rt_rq_active)); + print_group_header_rt(rt_rq, tgi_queue[j]); + fprintf(fp, "\n"); + } + tgi_q = 0; + + rt_rq_buf = GETBUF(SIZE(rt_rq)); + readmem(rt_rq_p, KVADDR, rt_rq_buf, SIZE(rt_rq), "rt_rq", + FAULT_ON_ERROR); + prio = INT(rt_rq_buf + OFFSET(rt_rq_highest_prio)); + INDENT(5 + 6 * depth); + fprintf(fp, "[%3d] ", prio); + tot++; + dump_RT_prio_array(1, depth + 1, rt_rq_p + OFFSET(rt_rq_active), + &rt_rq_buf[OFFSET(rt_rq_active)], cpu); + FREEBUF(rt_rq_buf); + } + if (tgi_q > 0) + tgi_q--; + + return tot; +} + +static void +dump_RT_prio_array(int g_flag, int depth, ulong k_prio_array, + char *u_prio_array, int cpu) { int i, c, tot, cnt, qheads; ulong offset, kvaddr, uvaddr; @@ -7663,11 +8013,29 @@ dump_RT_prio_array(int depth, ulong k_prio_array, char *u_prio_array) struct list_data list_data, *ld; struct task_context *tc; ulong *tlist; - ulong my_q, task_addr; + ulong my_q, task_addr, rt_rq, tg; char *rt_rq_buf; - if (!depth) + rt_rq = k_prio_array - OFFSET(rt_rq_active); + if (!depth) { fprintf(fp, " RT 
PRIO_ARRAY: %lx\n", k_prio_array); + } else { + fprintf(fp, "GROUP RT PRIO_ARRAY: %lx", k_prio_array); + if (g_flag) { + readmem(rt_rq + OFFSET(rt_rq_tg), KVADDR, + &tg, sizeof(ulong), "rt_rq tg", + FAULT_ON_ERROR); + for (i = 0; i < tgi_p; i++) { + if (tgi_array[i].task_group == tg) + break; + } + if (i < tgi_p) { + tgi_array[i].use = 0; + print_group_header_rt(rt_rq, &tgi_array[i]); + } + } + fprintf(fp, "\n"); + } qheads = (i = ARRAY_LENGTH(rt_prio_array_queue)) ? i : get_array_length("rt_prio_array.queue", NULL, SIZE(list_head)); @@ -7702,28 +8070,30 @@ dump_RT_prio_array(int depth, ulong k_prio_array, char *u_prio_array) cnt = retrieve_list(tlist, cnt); for (c = 0; c < cnt; c++) { task_addr = tlist[c]; - if (VALID_MEMBER(sched_rt_entity_my_q)) { - readmem(tlist[c] + OFFSET(sched_rt_entity_my_q), - KVADDR, &my_q, sizeof(ulong), "my_q", - FAULT_ON_ERROR); - if (my_q) { - rt_rq_buf = GETBUF(SIZE(rt_rq)); - readmem(my_q, KVADDR, rt_rq_buf, - SIZE(rt_rq), "rt_rq", - FAULT_ON_ERROR); - - INDENT(5 + 6 * depth); - fprintf(fp, "[%3d] ", i); - fprintf(fp, "GROUP RT PRIO_ARRAY: %lx\n", - my_q + OFFSET(rt_rq_active)); - tot++; - dump_RT_prio_array(depth + 1, - my_q + OFFSET(rt_rq_active), - &rt_rq_buf[OFFSET(rt_rq_active)]); - continue; - } else - task_addr -= OFFSET(task_struct_rt); + if (INVALID_MEMBER(sched_rt_entity_my_q)) + goto is_task; + + readmem(tlist[c] + OFFSET(sched_rt_entity_my_q), + KVADDR, &my_q, sizeof(ulong), "my_q", + FAULT_ON_ERROR); + if (!my_q) { + task_addr -= OFFSET(task_struct_rt); + goto is_task; } + + rt_rq_buf = GETBUF(SIZE(rt_rq)); + readmem(my_q, KVADDR, rt_rq_buf, SIZE(rt_rq), + "rt_rq", FAULT_ON_ERROR); + + INDENT(5 + 6 * depth); + fprintf(fp, "[%3d] ", i); + tot++; + dump_RT_prio_array(g_flag, depth + 1, my_q + OFFSET(rt_rq_active), + &rt_rq_buf[OFFSET(rt_rq_active)], cpu); + FREEBUF(rt_rq_buf); + continue; + +is_task: if (!(tc = task_to_context(task_addr))) continue; @@ -7736,12 +8106,146 @@ dump_RT_prio_array(int depth, ulong 
k_prio_array, char *u_prio_array) FREEBUF(tlist); } + if (g_flag) + tot += dump_tasks_in_lower_dequeued_rt_rq(depth, rt_rq, cpu); + if (!tot) { - INDENT(5 + 9 * depth); + INDENT(5 + 6 * depth); fprintf(fp, "[no tasks queued]\n"); } } +static void +get_task_group_name(ulong group, char **group_name) +{ + ulong cgroup, dentry, name; + char *dentry_buf, *tmp; + int len; + + readmem(group + OFFSET(task_group_css) + OFFSET(cgroup_subsys_state_cgroup), + KVADDR, &cgroup, sizeof(ulong), + "task_group css cgroup", FAULT_ON_ERROR); + if (cgroup == 0) + return; + + readmem(cgroup + OFFSET(cgroup_dentry), KVADDR, &dentry, sizeof(ulong), + "cgroup dentry", FAULT_ON_ERROR); + if (dentry == 0) + return; + + dentry_buf = GETBUF(SIZE(dentry)); + readmem(dentry, KVADDR, dentry_buf, SIZE(dentry), + "dentry", FAULT_ON_ERROR); + len = UINT(dentry_buf + OFFSET(dentry_d_name) + OFFSET(qstr_len)); + tmp = GETBUF(len + 1); + name = ULONG(dentry_buf + OFFSET(dentry_d_name) + OFFSET(qstr_name)); + BZERO(group_name, len + 1); + readmem(name, KVADDR, tmp, len, "qstr name", FAULT_ON_ERROR); + *group_name = tmp; + FREEBUF(dentry_buf); +} + +static void +fill_task_group_info_array(int depth, ulong group, char *group_buf) +{ + ulong kvaddr, uvaddr, offset; + ulong list_head[2], next; + + if (depth) + tgi_array[tgi_p].use = 1; + tgi_array[tgi_p].depth = depth; + get_task_group_name(group, &tgi_array[tgi_p].name); + tgi_array[tgi_p++].task_group = group; + + offset = OFFSET(task_group_children); + kvaddr = group + offset; + uvaddr = (ulong)(group_buf + offset); + BCOPY((char *)uvaddr, (char *)&list_head[0], sizeof(ulong)*2); + + if ((list_head[0] == kvaddr) && (list_head[1] == kvaddr)) + return; + + next = list_head[0]; + while (next != kvaddr) { + group = next - OFFSET(task_group_siblings); + readmem(group, KVADDR, group_buf, SIZE(task_group), + "task_group", FAULT_ON_ERROR); + next = ULONG(group_buf + OFFSET(task_group_siblings) + + OFFSET(list_head_next)); + fill_task_group_info_array(depth + 1, 
group, group_buf); + } +} + +static void +dump_tasks_by_task_group(void) +{ + int cpu; + ulong root_task_group, cfs_rq, cfs_rq_p; + ulong rt_rq, rt_rq_p; + char *buf, *rt_rq_buf; + struct rb_root *root; + struct task_context *tc; + + cfs_rq_offset_init(); + task_group_offset_init(); + + root_task_group = 0; + if (symbol_exists("init_task_group")) + root_task_group = symbol_value("init_task_group"); + else if (symbol_exists("root_task_group")) + root_task_group = symbol_value("root_task_group"); + else + error(FATAL, "cannot determine root task_group\n"); + + tgi_array = (struct task_group_info *) + GETBUF(sizeof(struct task_group_info) * MAX_GROUP_NUM); + buf = GETBUF(SIZE(task_group)); + readmem(root_task_group, KVADDR, buf, SIZE(task_group), + "task_group", FAULT_ON_ERROR); + fill_task_group_info_array(0, root_task_group, buf); + sort_task_group_info_array(tgi_array, tgi_p); + if (CRASHDEBUG(1)) + print_task_group_info_array(tgi_array, tgi_p); + + get_active_set(); + rt_rq_buf = GETBUF(SIZE(rt_rq)); + + for (cpu = 0; cpu < kt->cpus; cpu++) { + fprintf(fp, "%sCPU %d\n", cpu ? 
"\n" : "", cpu); + fprintf(fp, " CURRENT: "); + if ((tc = task_to_context(tt->active_set[cpu]))) + fprintf(fp, "PID: %-5ld TASK: %lx COMMAND: \"%s\"\n", + tc->pid, tc->task, tc->comm); + else + fprintf(fp, "%lx\n", tt->active_set[cpu]); + + readmem(root_task_group, KVADDR, buf, SIZE(task_group), + "task_group", FAULT_ON_ERROR); + rt_rq = ULONG(buf + OFFSET(task_group_rt_rq)); + readmem(rt_rq + cpu * sizeof(ulong), KVADDR, &rt_rq_p, + sizeof(ulong), "task_group rt_rq", FAULT_ON_ERROR); + readmem(rt_rq_p, KVADDR, rt_rq_buf, SIZE(rt_rq), + "rt_rq", FAULT_ON_ERROR); + dump_RT_prio_array(1, 0, rt_rq_p + OFFSET(rt_rq_active), + &rt_rq_buf[OFFSET(rt_rq_active)], cpu); + reuse_task_group_info_array(tgi_array, tgi_p); + tgi_q = 0; + + cfs_rq = ULONG(buf + OFFSET(task_group_cfs_rq)); + readmem(cfs_rq + cpu * sizeof(ulong), KVADDR, &cfs_rq_p, + sizeof(ulong), "task_group cfs_rq", FAULT_ON_ERROR); + root = (struct rb_root *)(cfs_rq + OFFSET(cfs_rq_tasks_timeline)); + fprintf(fp, " CFS RB_ROOT: %lx\n", (ulong)root); + dump_tasks_in_cfs_rq(1, 0, cfs_rq_p, cpu); + reuse_task_group_info_array(tgi_array, tgi_p); + tgi_q = 0; + } + + FREEBUF(rt_rq_buf); + FREEBUF(buf); + free_task_group_info_array(tgi_array, &tgi_p); +} + #undef _NSIG #define _NSIG 64 #define _NSIG_BPW machdep->bits -- 1.7.1
>From bdf032a57344b5592ba19a6c0c304ba7069c4b17 Mon Sep 17 00:00:00 2001
From: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>
Date: Tue, 6 Nov 2012 16:46:36 +0800
Subject: [PATCH] add help info for runq -g

Signed-off-by: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>
---
 help.c |   42 ++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/help.c b/help.c
index 14bf533..9a63d04 100755
--- a/help.c
+++ b/help.c
@@ -2201,7 +2201,7 @@ NULL
 char *help_runq[] = {
 "runq",
 "run queue",
-"[-t]",
+"[-t] [-g]",
 "  With no argument, this command displays the tasks on the run queues",
 "  of each cpu.",
 "  ",
@@ -2209,7 +2209,9 @@ char *help_runq[] = {
 "  rq.clock, rq.most_recent_timestamp or rq.timestamp_last_tick value,",
 "  whichever applies; following each cpu timestamp is the last_run or ",
 "  timestamp value of the active task on that cpu, whichever applies, ",
-"  along with the task identification.",
+"  along with the task identification.",
+"  -g  Display tasks with group information of each cpu hierarchically.  Note",
+"      that tasks in throttled cfs_rq/rt_rq are also displayed.",
 "\nEXAMPLES",
 "  Display the tasks on an O(1) scheduler run queue:\n",
 "    %s> runq",
@@ -2259,6 +2261,42 @@ char *help_runq[] = {
 "     2680986785772  PID: 28227  TASK: ffff8800787780c0  COMMAND: \"loop\"",
 "  CPU 3: 2680990954469",
 "     2680986059540  PID: 28226  TASK: ffff880078778b00  COMMAND: \"loop\"",
+"  ",
+"  Display tasks with group information hierarchically:\n",
+"    %s> runq -g ",
+"    CPU 0",
+"      CURRENT: PID: 14734  TASK: ffff88010626f500  COMMAND: \"sh\"",
+"      RT PRIO_ARRAY: ffff880028216808",
+"         [  0] GROUP RT PRIO_ARRAY: ffff880139fc9800 <test1> (THROTTLED)",
+"            [  0] PID: 14750  TASK: ffff88013a4dd540  COMMAND: \"rtloop99\"",
+"            [  1] PID: 14748  TASK: ffff88013bbca040  COMMAND: \"rtloop98\"",
+"            [  1] GROUP RT PRIO_ARRAY: ffff880089029000 <test11>",
+"               [  1] PID: 14752  TASK: ffff880088abf500  COMMAND: \"rtloop98\"",
+"            [ 54] PID: 14749  TASK: ffff880037a4e080  COMMAND: \"rtloop45\"",
+"            [ 98] PID: 14746  TASK: ffff88012678c080  COMMAND: \"rtloop1\"",
+"      CFS RB_ROOT: ffff88013fc23050",
+"         [120] PID: 14740  TASK: ffff88013b1e6080  COMMAND: \"sh\"",
+"         [120] PID: 14738  TASK: ffff88012678d540  COMMAND: \"sh\"",
+"         GROUP CFS RB_ROOT: ffff8800897af430 <test2> (THROTTLED)",
+"            [120] PID: 14732  TASK: ffff88013bbcb500  COMMAND: \"sh\"",
+"            [120] PID: 14728  TASK: ffff8800b3496080  COMMAND: \"sh\"",
+"            [120] PID: 14730  TASK: ffff880037833540  COMMAND: \"sh\"",
+"         GROUP CFS RB_ROOT: ffff880037943e30 <test1> (THROTTLED)",
+"            [120] PID: 14726  TASK: ffff880138d42aa0  COMMAND: \"sh\"",
+"  ",
+"    CPU 1",
+"      CURRENT: PID: 3269   TASK: ffff88013b0fa040  COMMAND: \"bash\"",
+"      RT PRIO_ARRAY: ffff880028296808",
+"         [  0] GROUP RT PRIO_ARRAY: ffff88008a1f5000 <test1> (THROTTLED)",
+"            [  0] GROUP RT PRIO_ARRAY: ffff880121774800 <test11>",
+"               [  0] PID: 14753  TASK: ffff88013bbbaae0  COMMAND: \"rtloop99\"",
+"            [ 98] PID: 14745  TASK: ffff880126763500  COMMAND: \"rtloop1\"",
+"            [ 98] PID: 14747  TASK: ffff88013b1e6ae0  COMMAND: \"rtloop1\"",
+"      CFS RB_ROOT: ffff88013fc23050",
+"         GROUP CFS RB_ROOT: ffff8800896eac30 <test1>",
+"            [120] PID: 14724  TASK: ffff880139632080  COMMAND: \"sh\"",
+"            [120] PID: 14742  TASK: ffff880126762aa0  COMMAND: \"sh\"",
+"            [120] PID: 14736  TASK: ffff88010626e040  COMMAND: \"sh\"",
 NULL
 };
--
1.7.1
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility