Re: [PATCH v5 0/5] Reduce overhead of LSMs with static calls

Paolo Abeni <pabeni@xxxxxxxxxx> · Mon, 02 Oct 2023 15:27:36 +0200

On Mon, 2023-10-02 at 13:09 +0200, KP Singh wrote:
> On Mon, Oct 2, 2023 at 1:06 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> > On Thu, 2023-09-28 at 22:24 +0200, KP Singh wrote:
> > > # Background
> > > 
> > > LSM hooks (callbacks) are currently invoked as indirect function calls. These
> > > callbacks are registered into a linked list at boot time as the order of the
> > > LSMs can be configured on the kernel command line with the "lsm=" command line
> > > parameter.
> > > 
> > > Indirect function calls have a high overhead due to retpoline mitigation for
> > > various speculative execution attacks.
> > > 
> > > Retpolines remain relevant even with newer generation CPUs, as recently
> > > discovered speculative attacks like Spectre BHB need retpolines to mitigate
> > > branch history injection, and retpolines still need to be used in combination
> > > with newer mitigation features like eIBRS.
> > > 
> > > This overhead is especially significant for the "bpf" LSM, which allows the
> > > user to implement LSM functionality with eBPF programs. To facilitate this,
> > > the "bpf" LSM provides a default callback for all LSM hooks. When enabled,
> > > the "bpf" LSM incurs an unnecessary / avoidable indirect call. This is
> > > especially bad in OS hot paths (e.g. in the networking stack).
> > > This overhead prevents the adoption of the bpf LSM on performance-critical
> > > systems and, in general, slows down all LSMs.
> > > 
> > > Since we know the addresses of the enabled LSM callbacks at compile time and
> > > only the order is determined at boot time, the LSM framework can allocate
> > > static calls for each of the possible LSM callbacks, and these calls can be
> > > updated once the order is determined at boot.
> > > 
> > > This series is a respin of the RFC proposed by Paul Renauld (renauld@xxxxxxxxxx)
> > > and Brendan Jackman (jackmanb@xxxxxxxxxx) [1]
> > > 
> > > # Performance improvement
> > > 
> > > With this patch-set, some syscalls with lots of LSM hooks in their path
> > > benefited by ~3% on average, with I/O and pipe-based system calls benefiting
> > > the most.
> > > 
> > > Here are the results of the relevant Unixbench system benchmarks, run with
> > > and without these patches, with BPF LSM and SELinux enabled using default
> > > policies.
> > > 
> > > Benchmark                                               Delta(%): (+ is better)
> > > ===============================================================================
> > > Execl Throughput                                             +1.9356
> > > File Write 1024 bufsize 2000 maxblocks                       +6.5953
> > > Pipe Throughput                                              +9.5499
> > > Pipe-based Context Switching                                 +3.0209
> > > Process Creation                                             +2.3246
> > > Shell Scripts (1 concurrent)                                 +1.4975
> > > System Call Overhead                                         +2.7815
> > > System Benchmarks Index Score (Partial Only):                +3.4859
> > 
> > FTR, I also measured a ~3% throughput improvement in a UDP stream test over
> > loopback.
> > 
> 
> Thanks for running the numbers and testing these patches, greatly appreciated!
> 
> > @KP Singh, I would have appreciated being cc-ed here, since I provided
> 
> Definitely, a miss on my part. Will keep you Cc'ed in any future revisions.

Thanks!

> I think we can also add a Tested-by: tag on the main patch and add
> your performance numbers to the commit as well.

Feel free to include that, even if my testing is limited to the
performance test described above.

Cheers,

Paolo