Hi,

Here's a preview of what I'm planning to discuss at the LPC toolchains
microconference.  Feel free to start the discussion early :-)

This is a proposal for some new minor GCC/Clang features which would
help objtool greatly.

Background
----------

Objtool is a kernel-specific tool which reverse engineers the control
flow graph (CFG) of compiled objects.  It then performs various
validations, annotations, and modifications, mostly with the goal of
improving robustness and security of the kernel.

Objtool features which use the CFG include: validation/generation of
unwinding metadata; validation of Intel SMAP rules; and validation of
kernel "noinstr" rules (preventing compiler instrumentation in certain
critical sections).

In general it's not feasible for the traditional toolchain to do any of
this work, because the kernel has a lot of "blind spots" which the
toolchain doesn't have visibility into, notably asm and inline asm.
Manual .cfi annotations are very difficult to maintain, and it's even
more difficult to ensure their correctness.  Also, due to kernel live
patching, the kernel relies on 100% correctness of unwinding metadata,
whereas the toolchain treats it as a best effort.

Challenges
----------

Reverse engineering the control flow graph is mostly quite
straightforward, with two notable exceptions:

1) Jump tables (e.g., switch statements):

   Depending on the architecture, it's somewhere between difficult and
   impossible to reliably identify which indirect jumps correspond to
   jump tables, and what their corresponding intra-function jump
   destinations are.

2) Noreturn functions:

   There's no reliable way to determine which functions are designated
   by the compiler to be noreturn (either explicitly via function
   attribute, or implicitly via a static function which is a wrapper
   around a noreturn function).  This information is needed because the
   code after the call to such a function is optimized out as
   unreachable, and objtool has no way of knowing that.

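To make the noreturn problem concrete, here is a small illustrative C
example.  The function names (fatal, fatal_wrapper, checked_double) are
made up for this sketch and don't come from the kernel.  Depending on
the compiler and optimization level, the error path in checked_double()
may end at a call instruction with no code after it, and nothing in the
object file records that the callee doesn't return.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Explicit noreturn: the attribute is visible in the source, but it is
 * not recorded anywhere in the compiled object.
 */
__attribute__((__noreturn__)) static void fatal(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(1);
}

/*
 * Implicit noreturn: there is no attribute here, but since this static
 * function only wraps a noreturn function, the compiler may treat it
 * as noreturn too.
 */
static void fatal_wrapper(void)
{
	fatal("negative input");
}

/*
 * On the error path, the compiler may emit nothing after the call (no
 * jump, no return), because that code would be unreachable.  A tool
 * which only sees the object code cannot tell that the call never
 * comes back.
 */
int checked_double(int x)
{
	if (x < 0)
		fatal_wrapper();

	return x * 2;
}
```

For example, checked_double(3) returns 6, while checked_double(-1)
prints a message and exits without ever returning to the caller.
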
Proposal
--------

Add the following new compiler flags which create non-allocatable ELF
sections which "annotate" control flow:

(Note this is purely hypothetical, intended for starting a discussion.
I'm not a compiler person and I haven't written any compiler code.)

1) -fannotate-jump-table

   Create an .annotate.jump_table section which is an array of the
   following variable-length structure:

     struct annotate_jump_table {
             void *indirect_jmp;
             long num_targets;
             void *targets[];
     };

   For example, given the following switch statement code:

     .Lswitch_jmp:
             // %rax is .Lcase_1 or .Lcase_2
             jmp %rax

     .Lcase_1:
             ...
     .Lcase_2:
             ...

   Add the following code:

     .pushsection .annotate.jump_table
             // indirect JMP address
             .quad .Lswitch_jmp

             // num jump targets
             .quad 2

             // indirect JMP target addresses
             .quad .Lcase_1
             .quad .Lcase_2
     .popsection

2) -fannotate-noreturn

   Create an .annotate.noreturn section which is an array of pointers to
   noreturn functions (both explicit/implicit and defined/undefined).

   For example, given the following three noreturn functions:

     // explicit noreturn:
     __attribute__((__noreturn__)) void func1(void)
     {
             exit(1);
     }

     // explicit noreturn (extern):
     extern __attribute__((__noreturn__)) void func2(void);

     // implicit noreturn:
     static void func3(void)
     {
             // call noreturn function
             func2();
     }

   Add the following code:

     .pushsection .annotate.noreturn
             .quad func1
             .quad func2
             .quad func3
     .popsection

Alternatives
------------

Another idea which has been floated in the past is for objtool to read
DWARF (or .eh_frame) to help it figure out the control flow.  That
hasn't been tried yet, but would be considerably more difficult and
fragile IMO.

-- 
Josh
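
As an illustration of the .annotate.jump_table layout, here is a rough C
sketch of how a consumer such as objtool might walk the section's
variable-length records.  It is only a sketch under assumptions:
walk_jump_tables() and jt_visit_fn are hypothetical names, the section
contents are assumed to be 64-bit, host-endian, and already relocated
(in a real object file the .quad entries would carry relocations), and
nothing here reflects actual objtool code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback, invoked once per (indirect jump, target) pair. */
typedef void (*jt_visit_fn)(uint64_t indirect_jmp, uint64_t target, void *ctx);

/*
 * Walk the raw bytes of a hypothetical .annotate.jump_table section.
 * Each record is: indirect_jmp (8 bytes), num_targets (8 bytes), then
 * num_targets target addresses (8 bytes each).
 *
 * Returns the number of records, or -1 if the data is malformed.
 */
long walk_jump_tables(const unsigned char *sec, size_t size,
		      jt_visit_fn visit, void *ctx)
{
	size_t off = 0;
	long nr_records = 0;

	while (off < size) {
		uint64_t indirect_jmp, num_targets, i;

		/* The fixed-size header must fit in what's left. */
		if (size - off < 16)
			return -1;

		memcpy(&indirect_jmp, sec + off, 8);
		memcpy(&num_targets, sec + off + 8, 8);
		off += 16;

		/*
		 * The targets[] array must fit too.  A negative
		 * num_targets shows up here as a huge unsigned value
		 * and is rejected.
		 */
		if (num_targets > (size - off) / 8)
			return -1;

		for (i = 0; i < num_targets; i++) {
			uint64_t target;

			memcpy(&target, sec + off, 8);
			off += 8;

			if (visit)
				visit(indirect_jmp, target, ctx);
		}

		nr_records++;
	}

	return nr_records;
}
```

Since each record carries its own num_targets, records can't be indexed
directly; a reader has to walk them in order and bounds-check every
record against the section size, as above.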