----- Original Message ----- > Initial version of a crash module which can be used to show which cgroups > is the currently active process member of. > --- > Hello this is a simple crash extension that I hacked up over the weekend, in my > case when I look at kernel crash dump I want to quickly understand which cgroup > is the current process member of. Currently it uses the process from the current > context but this might change in the future. Here is an example output: > > crash> show_cgroups > subsys: cpuset cgroup: c6666 > subsys: cpu cgroup: c6666 > subsys: cpuacct cgroup: c6666 > subsys: io cgroup: c6666 > subsys: memory cgroup: c6666 > subsys: devices cgroup: c6666 > subsys: freezer cgroup: c6666 > subsys: perf_event cgroup: c6666 > subsys: pids cgroup: c6666 > > I have tested this on 4.6-rc2 with and without cgroup support enabled. > > I'm just sending this to get an initial idea whether I have used crash's > facilities correctly and canvas for future ideas. I'm aware there are already 2 > cgroup modules but when I tried running either they complained of no cgroup > support or the command did nothing. In any case provided that the code is ok I > guess this can be used as a good example of how to traverse structures with crash > > TODO: > * Make the command understand either task_struct pointer or pid being passed. > * Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that point) > * Whatever people think might be useful Hello Nikolay, I have a few comments and suggestions: +#include "defs.h" + +#define CGROUP_PATH_MAX What is the purpose of CGROUP_PATH_MAX? +void proccgroup_init(void); +void proccgroup_fini(void); + ... [ cut ] ... + +void __attribute__((constructor)) +echo_init(void) +{ + register_extension(command_table); +} + +void __attribute__((destructor)) +echo_fini(void) { } + You've got forward declarations of proccgroup_init() and proccgroup_fini(), but the two functions don't exist. The "echo_xxx()" functions in the echo.c sample module are meant to be replaced by names reflecting your module. If I run this module on a 3.10 kernel, I get this: crash> extend proccgroup.so ./extensions/proccgroup.so: shared object loaded crash> show_cgroups Unsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionshow_cgroups: invalid kernel virtual address: 0 type: "cgroup_subsys_state->cgroup" Unsupported kernel version crash> because of this: + /* Handle the 2 cases of cgroup_name and the kernfs one */ + if (MEMBER_EXISTS("cgroup", "kn")) { + get_kn_cgroup_name(cgroup, subsys_ptr); + } else if (MEMBER_EXISTS("cgroup", "name")) { + fprintf(fp, "Unsupported kernel version"); + } You should replace the "fprintf(fp, "Unsupported kernel version")" error message above with an "error(FATAL, ...)" construct (and add a "\n" to the end), so that the command will fail immediately and return to the "crash> " prompt. Alternatively, you can have your module purposely fail to load if any required structure members do not exist. Typically modules will have their xxxx_init() constructor function do some pre-checks of the kernel being analyzed, and if the command will not be able to run correctly, return before calling register_extension(): void __attribute__((constructor)) xxxx_init(void) { if (!MEMBER_EXISTS(...) || !MEMBER_EXISTS(...)) { fprintf(fp, "some error message\n"); return; } register_extension(command_table); } I note that all of your readmem() calls use RETURN_ON_ERROR, such as these examples: +static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen) +{ + + ulong kernfs_node; + ulong cgroup_name_ptr; + ulong kernfs_parent; + + /* Get cgroup->kn */ + readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR, &kernfs_node, sizeof(void *), + "cgroup->kn", RETURN_ON_ERROR); + + readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"), KVADDR, &kernfs_parent, sizeof(void *), + "kernfs_node->parent", RETURN_ON_ERROR); + + if (kernfs_parent == 0) { + sprintf(buf, "/"); + return; + } + + /* Get kn->name */ + readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"), KVADDR, &cgroup_name_ptr, sizeof(void *), + "kernfs_node->name", RETURN_ON_ERROR); + + read_string(cgroup_name_ptr, buf, buflen-1); +} Unless you can do something useful in the case of a readmem() error, it's best to use FAULT_ON_ERROR. You will get an error message automatically generated by readmem(), but similar to "error(FATAL, ...)" it will return immediately to the "crash> " prompt. In your module above, you would just get a stream of confusing error messages. You might consider changing the name of the command to something that flows off the fingertips a bit easier: +static struct command_table_entry command_table[] = { + { "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0}, + { NULL }, +}; Maybe something like "cgrp", or "showcg", or anything short but to the point. Thanks, Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility