On Mon, May 29, 2023 at 10:06:26PM -0400, chenzhiyin wrote:
> In the syscall test of UnixBench, performance regression occurred
> due to false sharing.
>
> The lock and atomic members, including file::f_lock, file::f_count
> and file::f_pos_lock are highly contended and frequently updated
> in the high-concurrency test scenarios. perf c2c identified one
> affected read access, file::f_op.
> To prevent false sharing, the layout of file struct is changed as
> following
> (A) f_lock, f_count and f_pos_lock are put together to share the
> same cache line.
> (B) The read mostly members, including f_path, f_inode, f_op are
> put into a separate cache line.
> (C) f_mode is put together with f_count, since they are used
> frequently at the same time.
>
> The optimization has been validated in the syscall test of
> UnixBench. performance gain is 30~50%, when the number of parallel
> jobs is 16.
>
> Signed-off-by: chenzhiyin <zhiyin.chen@xxxxxxxxx>
> ---

Sounds interesting, but can we see the actual numbers, please?

So struct file is marked with __randomize_layout which seems to make
this whole reordering pointless or at least only useful if the
structure randomization Kconfig is turned off. Is there any precedent
for optimizing structures that are marked as randomizable?
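
For reference, the grouping described in (A) to (C) of the quoted patch
can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not the kernel's struct file: the struct name
demo_file, the member types, the 64-byte cache line size and the LINE_OF
macro are all made up here, and only the member names come from the patch
description.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines; the real size is architecture dependent. */
#define CACHE_LINE 64

/*
 * Hypothetical stand-in for struct file. Member types are placeholders
 * chosen for illustration only.
 */
struct demo_file {
	/* (A) + (C): frequently written members, plus f_mode, share one line. */
	_Alignas(CACHE_LINE) int f_lock;
	unsigned int f_mode;
	long f_count;
	long f_pos_lock;

	/* (B): read-mostly members start on a separate cache line. */
	_Alignas(CACHE_LINE) const void *f_path;
	const void *f_inode;
	const void *f_op;
};

/* Index of the cache line (relative to the struct start) holding a member. */
#define LINE_OF(member) (offsetof(struct demo_file, member) / CACHE_LINE)

/* Compile-time checks that the grouping holds for this fixed layout. */
_Static_assert(LINE_OF(f_lock) == LINE_OF(f_count), "hot members split");
_Static_assert(LINE_OF(f_count) == LINE_OF(f_pos_lock), "hot members split");
_Static_assert(LINE_OF(f_mode) == LINE_OF(f_count), "f_mode not with f_count");
_Static_assert(LINE_OF(f_op) != LINE_OF(f_count), "f_op shares the hot line");
```

These checks only hold because the compiler keeps members in declaration
order. With a randomized layout the declaration order is no longer what
ends up in memory, which is the concern raised above.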