Hi Daniel,

On Thu, Jun 16, 2022 at 1:45 AM Daniel Bristot de Oliveira
<bristot@xxxxxxxxxx> wrote:
>
> Over the last years, I've been exploring the possibility of
> verifying the Linux kernel behavior using Runtime Verification.
>
> Runtime Verification (RV) is a lightweight (yet rigorous) method that
> complements classical exhaustive verification techniques (such as model
> checking and theorem proving) with a more practical approach for complex
> systems.
>
> Instead of relying on a fine-grained model of a system (e.g., a
> re-implementation a instruction level), RV works by analyzing the trace of the
> system's actual execution, comparing it against a formal specification of
> the system behavior.
>
> The usage of deterministic automaton for RV is a well-established
> approach. In the specific case of the Linux kernel, you can check how
> to model complex behavior of the Linux kernel with this paper:
>
> DE OLIVEIRA, Daniel Bristot; CUCINOTTA, Tommaso; DE OLIVEIRA, Romulo Silva.
> *Efficient formal verification for the Linux kernel.* In: International
> Conference on Software Engineering and Formal Methods. Springer, Cham, 2019.
> p. 315-332.
>
> And how efficient is this approach here:
>
> DE OLIVEIRA, Daniel B.; DE OLIVEIRA, Romulo S.; CUCINOTTA, Tommaso. *A thread
> synchronization model for the PREEMPT_RT Linux kernel.* Journal of Systems
> Architecture, 2020, 107: 101729.
>
> tlrd: it is possible to model complex behaviors in a modular way, with
> an acceptable overhead (even for production systems).
> See this presentation at 2019's ELCE: https://www.youtube.com/watch?v=BfTuEHafNgg
>
> Here I am proposing a more practical approach for the usage of deterministic
> automata for runtime verification, and it includes:
>
> - An interface for controlling the verification;
> - A tool and set of headers that enables the automatic code
>   generation of the RV monitor (Monitor Synthesis);
> - Sample monitors to evaluate the interface;
> - A sample monitor developed in the context of the Elisa Project
>   demonstrating how to use RV in the context of safety-critical
>   systems.
>
> Given that RV is a tracing consumer, the code is being placed inside the
> tracing subsystem (Steven and I have been talking about it for a while).

This is interesting work!

I applied the series on top of commit 78ca55889a549a9a194c6ec666836329b774ab6d
in upstream. Then, I got some compile/link errors for CONFIG_RV_MON_WIP and
CONFIG_RV_MON_SAFE_WTD. I was able to compile the kernel with these two
configs disabled. However, I hit an issue with monitors/wwnr/enable:

[root@eth50-1 ~]# cd /sys/kernel/debug/tracing/rv/
[root@eth50-1 rv]# cat available_monitors
wwnr
[root@eth50-1 rv]# echo wwnr > enabled_monitors
[root@eth50-1 rv]# cd monitors/
[root@eth50-1 monitors]# cd wwnr/
[root@eth50-1 wwnr]# ls
desc  enable  reactors
[root@eth50-1 wwnr]# cat enable
1
[root@eth50-1 wwnr]# echo 0 > enable    <<< hangs

The last echo command hangs forever on a QEMU VM. I haven't figured out
why this happens, though.

I also have a more general question: can we do RV with BPF and simplify the
work? AFAICT, the idea of RV is to maintain a state machine based on events.
If something unexpected happens, call the reactor. IIUC, BPF has most of
these building blocks ready for use. With BPF, we can ship many RV monitors
without many kernel changes.

Here is my toy wwnr in bpftrace. The reactor is "print to console". It runs
on most systems with BPF and tracepoints enabled.
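
Before the bpftrace version, the "state machine plus reactor" idea described
above can be illustrated with a small userspace sketch in Python. This is only
an illustration under stated assumptions: it is neither the wwnr model from
the series nor BPF code. The transition table, the Monitor class, and the
choice to start every task in not_running are hypothetical and exist only to
show the mechanism (an event with no entry in the table calls the reactor).

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical two-state model, named after the states used in the toy script.
NOT_RUNNING, RUNNING = "not_running", "running"

# Assumed transition table: (current state, event) -> next state.
# Any pair missing from the table is treated as a violation.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    (NOT_RUNNING, "switch_in"): RUNNING,
    (NOT_RUNNING, "wakeup"): NOT_RUNNING,
    (RUNNING, "switch_out"): NOT_RUNNING,
}


class Monitor:
    """Per-task deterministic automaton that calls a reactor on bad events."""

    def __init__(self, reactor: Callable[[int, str, str], None]) -> None:
        self.reactor = reactor
        self.state: Dict[int, str] = {}  # pid -> current state

    def handle(self, pid: int, event: str) -> bool:
        # Assumption: a task seen for the first time starts in NOT_RUNNING.
        cur = self.state.get(pid, NOT_RUNNING)
        nxt = TRANSITIONS.get((cur, event))
        if nxt is None:
            # Unexpected event: call the reactor, then reset to the initial state.
            self.reactor(pid, cur, event)
            self.state[pid] = NOT_RUNNING
            return False
        self.state[pid] = nxt
        return True


# The reactor here only records the violation; printing would work as well.
violations: List[Tuple[int, str, str]] = []
mon = Monitor(lambda pid, state, event: violations.append((pid, state, event)))

# A made-up trace for one task; the third event is a wakeup while running.
for pid, event in [(42, "wakeup"), (42, "switch_in"), (42, "wakeup"), (42, "switch_in")]:
    mon.handle(pid, event)

print(violations)
```

In this sketch the monitor logic is just a table lookup per event, which is
the part that would map onto a BPF map keyed by pid.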

I probably missed some events; as a result, the bpftrace script below
triggers the "reactor" a lot.

=============== 8< ======================
[root@ ~]# cat wwnr.bt
/*
 * task_state[pid]
 *    not_running = 1
 *    running = 2
 */
tracepoint:sched:sched_switch
{
        if (args->prev_state == 0x0001 /* TASK_INTERRUPTIBLE */) {
                /* after first suspension */
                @task_state[args->prev_pid] = 1;
        } else {
                if (@task_state[args->prev_pid] == 1) {
                        printf("Something wrong, call reactor\n");
                }
                @task_state[args->prev_pid] = 1;
        }
        @task_state[args->next_pid] = 2;
}

tracepoint:sched:sched_wakeup
{
        if (@task_state[args->pid] == 2) {
                printf("Something wrong, call reactor\n");
        }
        @task_state[args->pid] = 2;
}
[root@ ~]# bpftrace wwnr.bt
<<<< some print >>>>
=============== 8< ======================

Does this (BPF for RV) make any sense?

Thanks,
Song