Dear community,

I have created a kernel module that adds probes to the kernel functions do_execve() and do_exit() (code at the end of this email). It is running on a custom system based on kernel version 3.18.31. The goal of this module is to see whether I can capture information about any process that is about to start or about to leave userspace.

I have tested the following scenarios:

- app starts
- app finishes its execution gracefully
- app is killed
- app crashes

In the first three cases I can retrieve information from the process, but in the last case I get an unexpected kernel Oops. More specifically, I have trouble retrieving the command-line arguments of the process, and this seems to be due to some unusual race condition.

To simplify things, I have reduced the original source code to the command-line part. The getCommandLine() function is not shown here because it is a copy of get_cmdline() from mm/util.c (https://elixir.bootlin.com/linux/latest/source/mm/util.c#L855).

This version of get_cmdline() uses synchronization mechanisms (in my case, I implemented it with semaphores instead of spinlocks), which causes the kernel to crash:

...
BUG: scheduling while atomic: mysegfaultapp/6037/0x00000002
Modules linked in: ...
CPU: 0 PID: 9313 Comm: mysegfaultapp Tainted: P W O 3.18.31 #2
[<c0014024>] (unwind_backtrace) from [<c00119f0>] (show_stack+0x10/0x14)
[<c00119f0>] (show_stack) from [<c0039830>] (__schedule_bug+0x44/0x60)
[<c0039830>] (__schedule_bug) from [<c0838040>] (__schedule+0x68/0x470)
[<c0838040>] (__schedule) from [<c083a864>] (rwsem_down_read_failed+0x104/0x130)
[<c083a864>] (rwsem_down_read_failed) from [<bf000918>] (getCommandLine.constprop.0+0x44/0x160 [mymodule])
[<bf000918>] (getCommandLine.constprop.0 [mymodule]) from [<bf000644>] (doExitHandler+0x1dc/0x25c [mymodule])
[<bf000644>] (doExitHandler [mymodule]) from [<c0021850>] (SyS_exit_group+0x0/0x10)
[<c0021850>] (SyS_exit_group) from [<00000009>] (0x9)
Unable to handle kernel paging request at virtual address fffffffe
pgd = dbc20000
[fffffffe] *pgd=9f3f8821, *pte=00000000, *ppte=00000000
Internal error: Oops: 80000007 [#1] PREEMPT ARM
...

However, if I use an implementation without synchronization mechanisms (the one that matches my kernel version, https://elixir.bootlin.com/linux/v3.18.31/source/mm/util.c#L355), then when a running app segfaults and crashes I am not able to report its command line, but the system keeps running. (For reference, the app is a dummy that causes a segfault on purpose, here called "mysegfaultapp".)

Given these situations, I have a few questions, and I hope the community can point me to where to look further:

1) Is it possible to retrieve the command-line arguments of a userspace process that crashed?
2) How can I investigate the cause of this crash in rwsem_down_read_failed?
3) If I go with the v3.18.31 version that doesn't use synchronization structures (semaphores or spinlocks), what are the risks?

Please let me know if you need further information or have any questions.

Thanks in advance,
Cesar.
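
For readers who want to reproduce the crashing scenario, a dummy app that segfaults on purpose could look roughly like the sketch below. This is an assumption about how such a test program might be written, not the actual source of mysegfaultapp; the helper name crash_in_child() is hypothetical. It forks so that the parent can confirm the child really died from SIGSEGV.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that crashes on purpose and return
 * the number of the signal that killed it (0 if it exited normally,
 * -1 if fork() or waitpid() failed).
 */
int crash_in_child(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: write through a NULL pointer to provoke SIGSEGV. */
        volatile int *bad = NULL;
        *bad = 1;
        /* Fallback in case the write did not fault on this platform. */
        raise(SIGSEGV);
        _exit(1);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling crash_in_child() from a small main() and starting the program with a few arguments would give an exit probe a crashing process, with a known command line, to observe.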
-------------------------------------------------------------------------------------------------------------------------------------------

static struct kretprobe initProcess;
static struct jprobe exitProcess;

static void doExitHandler(long code)
{
	char commandLine[200];

	memset(commandLine, 0, sizeof(commandLine));
	if (getCommandLine(current, commandLine, sizeof(commandLine)) <= 0) {
		strcpy(commandLine, "ERROR");
	}
	printk(KERN_INFO "doExitHandler %s\n", commandLine);
	jprobe_return();
}

static int doExecHandler(struct kretprobe_instance *pMetadata, struct pt_regs *pRegs)
{
	char commandLine[200];

	memset(commandLine, 0, sizeof(commandLine));
	if (getCommandLine(current, commandLine, sizeof(commandLine)) <= 0) {
		strcpy(commandLine, "ERROR");
	}
	printk(KERN_INFO "doExecHandler %s\n", commandLine);
	return 0;
}

static int myInit(void)
{
	int retval;

	initProcess.kp.symbol_name = "do_execve";
	initProcess.handler = doExecHandler;
	retval = register_kretprobe(&initProcess);

	exitProcess.kp.symbol_name = "do_exit";
	exitProcess.entry = JPROBE_ENTRY(doExitHandler);
	retval = register_jprobe(&exitProcess);

	return retval;
}

static void myExit(void)
{
	unregister_kretprobe(&initProcess);
	unregister_jprobe(&exitProcess);
}

module_init(myInit);
module_exit(myExit);

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
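
As a way to cross-check the strings the two handlers print, the same argument block can be read from userspace while the process is still alive. The following is a minimal sketch, assuming /proc is mounted on the target system; read_proc_cmdline() and format_cmdline() are hypothetical helper names and are not part of the module above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read /proc/<pid>/cmdline into buf.
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 */
long read_proc_cmdline(int pid, char *buf, size_t buflen)
{
    char path[64];
    FILE *fp;
    size_t n;

    snprintf(path, sizeof(path), "/proc/%d/cmdline", pid);
    fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;
    }
    n = fread(buf, 1, buflen, fp);
    fclose(fp);
    return (long)n;
}

/*
 * Hypothetical helper: the argument block is NUL-separated, so turn the
 * separators into spaces to get one printable string.
 * Returns the length of the resulting string.
 */
size_t format_cmdline(char *buf, size_t len)
{
    size_t i;

    if (len == 0) {
        return 0;
    }
    /* Replace every separator except the last byte of the block. */
    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == '\0') {
            buf[i] = ' ';
        }
    }
    /* Force termination; this drops the last byte if the block was truncated. */
    buf[len - 1] = '\0';
    return strlen(buf);
}
```

Feeding the bytes returned by read_proc_cmdline() into format_cmdline() gives a string that can be compared against the "doExecHandler ..." and "doExitHandler ..." lines in the kernel log.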