On Mon, 2025-02-17 at 08:58 +0100, Mateusz Guzik wrote:
> On Mon, Feb 17, 2025 at 02:45:23PM +0800, kernel test robot wrote:
> > kernel test robot noticed a 160.4% improvement of
> > will-it-scale.per_process_ops on:
> >
> > commit: e38e699a42b4db5daf7dac453759fdc8ba0dab31 ("evm: Move
> > metadata in the inode security blob to a pointer")
> > https://github.com/robertosassu/linux evm-iint-ptr-v1-devel-v3
> >
> >     24.57      +72.2      96.76   perf-profile.children.cycles-pp.__fput
> >     24.63      +72.4      97.02   perf-profile.children.cycles-pp.__x64_sys_close
> >     24.67      +72.5      97.17   perf-profile.children.cycles-pp.__close
> >      0.00      +95.2      95.17   perf-profile.children.cycles-pp.osq_lock
> >      0.00      +95.5      95.53   perf-profile.children.cycles-pp.__mutex_lock
> >      0.00      +95.7      95.71   perf-profile.children.cycles-pp.evm_file_release
> >      0.00      +95.9      95.87   perf-profile.children.cycles-pp.security_file_release
> >     98.69      -98.7       0.00   perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>
> Contrary to what's indicated in the report, this change is in fact a
> significant slowdown (or rather, will be when other problems get
> fixed).
>
> The open3 microbenchmark issues open + close in a loop on the same
> file.
>
> On the stock kernel, some of the problem is false sharing within
> struct inode.
>
> The biggest bottleneck is lockref manipulation:
> - there is a lockref acquire and release happening *twice* instead of
>   just once
> - the lockref facility is prone to degrading to operation under a
>   spinlock and staying there when microbenchmarked. You can see on
>   the profile that this does happen here.
>
> evm also used to pop up, which I patched away in 699ae6241920b0fa
> ("evm: stop avoidably reading i_writecount in evm_file_release").
>
> Your patch adds a mutex, which adds 2 atomics to the fast path (so it
> slows down single-threaded operation) and, more importantly, adds a
> serialization point for multithreaded operation.
>
> In this case the resulting contention helps decrease the loss of
> performance in lockref, and that's how there is an apparent win.

Hi Mateusz, thanks for the explanation!

> I have a WIP patch to move dentries away from using lockref, which
> will in turn avoid the degradation. Should it land, the mutex added
> here will be the new bottleneck.
>
> It needs to be avoided by default. Do you *need* to test the
> condition in evm_file_release() with the lock held? Perhaps the
> initial test can be done without it and redone after acquiring it?

This patch was more an explorative work, to see what challenges we
encounter in moving away from embedding the structure in the inode
security blob and just using a pointer.

Currently, there is no gain in switching, since the requested blob
size remains 40 bytes (due to adding the mutex).

Certainly, it is possible to do the test without the mutex and then
redo it under the lock. If the EVM_NEW_FILE flag is not set, we can
avoid taking the lock entirely (a sketch of this pattern follows
below).

Thanks

Roberto
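
A minimal sketch of that check-then-lock pattern, assuming the
struct/field layout (flags plus the new mutex) and an evm_iint_inode()
accessor for the pointer in the inode security blob; the actual names
and the EVM_NEW_FILE value in the series may differ:

/*
 * Sketch only -- not the actual patch. The struct layout, the
 * evm_iint_inode() accessor and the flag value are assumptions
 * standing in for whatever the series defines.
 */
#include <linux/fs.h>
#include <linux/mutex.h>

#define EVM_NEW_FILE	0x00000001	/* value illustrative */

/* Hypothetical per-inode EVM metadata once it lives behind a pointer. */
struct evm_iint_cache {
	unsigned long flags;
	struct mutex mutex;		/* the lock added by the patch */
};

/* Assumed accessor returning the metadata from the inode security blob. */
static struct evm_iint_cache *evm_iint_inode(const struct inode *inode);

static void evm_file_release(struct file *file)
{
	struct inode *inode = file_inode(file);
	struct evm_iint_cache *iint = evm_iint_inode(inode);

	if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
		return;

	if (!iint)
		return;

	/*
	 * Lockless fast path: if EVM_NEW_FILE is not set there is
	 * nothing to clear, so the mutex can be skipped entirely.
	 */
	if (!(READ_ONCE(iint->flags) & EVM_NEW_FILE))
		return;

	/*
	 * Slow path: retest under the lock before touching the flags,
	 * in case a concurrent closer got here first.
	 */
	mutex_lock(&iint->mutex);
	if ((iint->flags & EVM_NEW_FILE) &&
	    atomic_read(&inode->i_writecount) == 1)
		iint->flags &= ~EVM_NEW_FILE;
	mutex_unlock(&iint->mutex);
}

The READ_ONCE() keeps the unlocked read well-defined, and the flag is
retested under the mutex so the common open + close path never takes
the lock once EVM_NEW_FILE has been cleared.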