On 1/19/2023 3:10 PM, KP Singh wrote:
> # Background
>
> LSM hooks (callbacks) are currently invoked as indirect function calls. These
> callbacks are registered into a linked list at boot time as the order of the
> LSMs can be configured on the kernel command line with the "lsm=" command line
> parameter.
>
> Indirect function calls have a high overhead due to retpoline mitigation for
> various speculative execution attacks.
>
> Retpolines remain relevant even with newer generation CPUs as recently
> discovered speculative attacks, like Spectre BHB need Retpolines to mitigate
> against branch history injection and still need to be used in combination with
> newer mitigation features like eIBRS.
>
> This overhead is especially significant for the "bpf" LSM which allows the user
> to implement LSM functionality with eBPF program. In order to facilitate this
> the "bpf" LSM provides a default callback for all LSM hooks. When enabled,
> the "bpf" LSM incurs an unnecessary / avoidable indirect call. This is
> especially bad in OS hot paths (e.g. in the networking stack).
> This overhead prevents the adoption of bpf LSM on performance critical
> systems, and also, in general, slows down all LSMs.
>
> Since we know the address of the enabled LSM callbacks at compile time and only
> the order is determined at boot time,

Not quite true. A system with Smack and AppArmor compiled in will only
be allowed to use one or the other.

> the LSM framework can allocate static
> calls for each of the possible LSM callbacks and these calls can be updated once
> the order is determined at boot.

True if you also provide for the single "major" LSM restriction.

> This series is a respin of the RFC proposed by Paul Renauld (renauld@xxxxxxxxxx)
> and Brendan Jackman (jackmanb@xxxxxxxxxx) [1]
>
> # Performance improvement
>
> With this patch-set some syscalls with lots of LSM hooks in their path
> benefitted at an average of ~3%.
> Here are the results of the relevant Unixbench
> system benchmarks with BPF LSM and a major LSM (in this case apparmor) enabled
> with and without the series.
>
> Benchmark                                          Delta(%): (+ is better)
> ===============================================================================
> Execl Throughput                                             +2.9015
> File Write 1024 bufsize 2000 maxblocks                       +5.4196
> Pipe Throughput                                              +7.7434
> Pipe-based Context Switching                                 +3.5118
> Process Creation                                             +0.3552
> Shell Scripts (1 concurrent)                                 +1.7106
> System Call Overhead                                         +3.0067
> System Benchmarks Index Score (Partial Only):                +3.1809

How about socket creation and packet delivery impact? You'll need to use
either SELinux or Smack to get those numbers.

> In the best case, some syscalls like eventfd_create benefitted to about ~10%.
> The full analysis can be viewed at https://kpsingh.ch/lsm-perf
>
> [1] https://lore.kernel.org/linux-security-module/20200820164753.3256899-1-jackmanb@xxxxxxxxxxxx/
>
> KP Singh (4):
>   kernel: Add helper macros for loop unrolling
>   security: Generate a header with the count of enabled LSMs
>   security: Replace indirect LSM hook calls with static calls
>   bpf: Only enable BPF LSM hooks when an LSM program is attached
>
>  include/linux/bpf.h              |   1 +
>  include/linux/bpf_lsm.h          |   1 +
>  include/linux/lsm_hooks.h        |  94 +++++++++++--
>  include/linux/unroll.h           |  35 +++++
>  kernel/bpf/trampoline.c          |  29 ++++-
>  scripts/Makefile                 |   1 +
>  scripts/security/.gitignore      |   1 +
>  scripts/security/Makefile        |   4 +
>  scripts/security/gen_lsm_count.c |  57 ++++++++
>  security/Makefile                |  11 ++
>  security/bpf/hooks.c             |  26 +++-
>  security/security.c              | 217 ++++++++++++++++++++-----------
>  12 files changed, 386 insertions(+), 91 deletions(-)
>  create mode 100644 include/linux/unroll.h
>  create mode 100644 scripts/security/.gitignore
>  create mode 100644 scripts/security/Makefile
>  create mode 100644 scripts/security/gen_lsm_count.c
>
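
For anyone following the thread, the slot scheme described in the cover
letter can be pictured with a small userspace sketch. This is only an
analogy under stated assumptions, not the code from the series: plain
function pointers plus an "active" flag stand in for kernel static calls
and static keys, and MAX_LSM_COUNT, struct hook_slot, register_file_open(),
call_file_open() and the two toy callbacks are all hypothetical names.
The point is that the number of slots is fixed at build time, while which
slots get filled (and in what order) is decided at boot, so an LSM that is
compiled in but not enabled simply never occupies a slot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical build-time upper bound on the LSMs compiled into the image. */
#define MAX_LSM_COUNT 3

typedef int (*file_open_hook_t)(const char *path);

/*
 * One slot per LSM that could be enabled. In the kernel each slot would be
 * a static call guarded by a static key; here a pointer and a flag stand in.
 */
struct hook_slot {
	file_open_hook_t fn;
	int active;
};

static struct hook_slot file_open_slots[MAX_LSM_COUNT];
static size_t nr_filled;

/* Boot-time step: fill the next free slot, in "lsm=" order. */
void register_file_open(file_open_hook_t fn)
{
	assert(nr_filled < MAX_LSM_COUNT);
	file_open_slots[nr_filled].fn = fn;
	file_open_slots[nr_filled].active = 1;
	nr_filled++;
}

/*
 * Hook invocation: a loop with a compile-time trip count that can be
 * unrolled. Slots never filled at boot are skipped; the first non-zero
 * return value ends the walk.
 */
int call_file_open(const char *path)
{
	for (size_t i = 0; i < MAX_LSM_COUNT; i++) {
		if (!file_open_slots[i].active)
			continue;
		int rc = file_open_slots[i].fn(path);
		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Toy callbacks standing in for two enabled LSMs. */
int toy_allow_all(const char *path)
{
	(void)path;
	return 0;
}

int toy_deny_secret(const char *path)
{
	/* Deny any path starting with 's'; purely illustrative policy. */
	return (path[0] == 's') ? -EACCES : 0;
}
```

Registering toy_allow_all and then toy_deny_secret leaves the third slot
inactive, which is roughly how a compiled-in but unselected "major" LSM
would have to be handled.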