On 04/11/2016 05:06 PM, Dave Anderson wrote: > > > ----- Original Message ----- >> Initial version of a crash module which can be used to show which cgroups >> is the currently active process member of. >> --- >> Hello this is a simple crash extension that I hacked up over the weekend, in my >> case when I look at kernel crash dump I want to quickly understand which cgroup >> is the current process member of. Currently it uses the process from the current >> context but this might change in the future. Here is an example output: >> >> crash> show_cgroups >> subsys: cpuset cgroup: c6666 >> subsys: cpu cgroup: c6666 >> subsys: cpuacct cgroup: c6666 >> subsys: io cgroup: c6666 >> subsys: memory cgroup: c6666 >> subsys: devices cgroup: c6666 >> subsys: freezer cgroup: c6666 >> subsys: perf_event cgroup: c6666 >> subsys: pids cgroup: c6666 >> >> I have tested this on 4.6-rc2 with and without cgroup support enabled. >> >> I'm just sending this to get an initial idea whether I have used crash's >> facilities correctly and canvas for future ideas. I'm aware there are already 2 >> cgroup modules but when I tried running either they complained of no cgroup >> support or the command did nothing. In any case provided that the code is ok I >> guess this can be used as a good example of how to traverse structures with crash >> >> TODO: >> * Make the command understand either task_struct pointer or pid being passed. >> * Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that point) >> * Whatever people think might be useful > > > Hello Nikolay, > > I have a few comments and suggestions: > > +#include "defs.h" > + > +#define CGROUP_PATH_MAX > > What is the purpose of CGROUP_PATH_MAX? I was thinking of using this define to limit the size of various buffers, but I saw that for most cases BUFSIZE works. This will likely be removed. > > +void proccgroup_init(void); > +void proccgroup_fini(void); > + > ... [ cut ] ... > + > +void __attribute__((constructor)) > +echo_init(void) > +{ > + register_extension(command_table); > +} > + > +void __attribute__((destructor)) > +echo_fini(void) { } > + > > You've got forward declarations of proccgroup_init() and proccgroup_fini(), > but the two functions don't exist. The "echo_xxx()" functions in the > echo.c sample module are meant to be replaced by names reflecting your > module. I will rename the echo* function, I guess I forgot doing that. Actually are forward declarations for the init/destroy functions really needed? I guess not. > > If I run this module on a 3.10 kernel, I get this: > > crash> extend proccgroup.so > ./extensions/proccgroup.so: shared object loaded > crash> show_cgroups > Unsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionshow_cgroups: invalid kernel virtual address: 0 type: "cgroup_subsys_state->cgroup" > Unsupported kernel version > crash> > > because of this: > > + /* Handle the 2 cases of cgroup_name and the kernfs one */ > + if (MEMBER_EXISTS("cgroup", "kn")) { > + get_kn_cgroup_name(cgroup, subsys_ptr); > + } else if (MEMBER_EXISTS("cgroup", "name")) { > + fprintf(fp, "Unsupported kernel version"); > + } > > You should replace the "fprintf(fp, "Unsupported kernel version")" error message > above with an "error(FATAL, ...)" construct (and add a "\n" to the end), so that > the command will fail immediately and return to the "crash> " prompt. I've since added supported for pre-3.15 kernels. So this will likely work, but I did see the output you showed. Clearly the failure case wasn't handled properly. > > Alternatively, you can have your module purposely fail to load if any required > structure members do not exist. Typically modules will have their xxxx_init() > constructor function do some pre-checks of the kernel being analyzed, and if the > command will not be able to run correctly, return before calling register_extension(): > > void __attribute__((constructor)) > xxxx_init(void) > { > if (!MEMBER_EXISTS(...) || !MEMBER_EXISTS(...)) { > fprintf(fp, "some error message\n"); > return; > } > > register_extension(command_table); > } > > I note that all of your readmem() calls use RETURN_ON_ERROR, such as these examples: > > +static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen) > +{ > + > + ulong kernfs_node; > + ulong cgroup_name_ptr; > + ulong kernfs_parent; > + > + /* Get cgroup->kn */ > + readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR, &kernfs_node, sizeof(void *), > + "cgroup->kn", RETURN_ON_ERROR); > + > + readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"), KVADDR, &kernfs_parent, sizeof(void *), > + "kernfs_node->parent", RETURN_ON_ERROR); > + > + if (kernfs_parent == 0) { > + sprintf(buf, "/"); > + return; > + } > + > + /* Get kn->name */ > + readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"), KVADDR, &cgroup_name_ptr, sizeof(void *), > + "kernfs_node->name", RETURN_ON_ERROR); > + > + read_string(cgroup_name_ptr, buf, buflen-1); > +} > > Unless you can do something useful in the case of a readmem() error, it's best to > use FAULT_ON_ERROR. You will get an error message automatically generated by readmem(), > but similar to "error(FATAL, ...)" it will return immediately to the "crash> " prompt. > In your module above, you would just get a stream of confusing error messages. I thought return_on_error essentially works with fault_on_error semantics. Clearly I've misunderstood that so will fix it. > > You might consider changing the name of the command to something that > flows off the fingertips a bit easier: > > +static struct command_table_entry command_table[] = { > + { "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0}, > + { NULL }, > +}; > > Maybe something like "cgrp", or "showcg", or anything short but to > the point. Yes, I'm not very happy with the current name so will likely change that. Thanks for taking the time to review it. > > Thanks, > Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility