On Wed, 28 Jun 2023 18:56:25 +0800 "Zhu, Lipeng" <lipeng.zhu@xxxxxxxxx> wrote:

> When running UnixBench/Shell Scripts, we observed high false sharing
> for accessing i_mmap against i_mmap_rwsem.
>
> UnixBench/Shell Scripts are typical load/execute command test scenarios,
> the i_mmap will be accessed frequently to insert/remove vma_interval_tree.
> Meanwhile, the i_mmap_rwsem is frequently loaded. Unfortunately, they are
> in the same cacheline.

That sounds odd. One would expect these two fields to be used in close
conjunction, so any sharing might even be beneficial. Can you identify
in more detail what's actually going on in there?

> The patch places the i_mmap and i_mmap_rwsem in separate cache lines to avoid
> this false sharing problem.
>
> With this patch, on Intel Sapphire Rapids 2 sockets 112c/224t platform, based
> on kernel v6.4-rc4, the 224 parallel score is improved ~2.5% for
> UnixBench/Shell Scripts case. And perf c2c tool shows the false sharing is
> resolved as expected, the symbol vma_interval_tree_remove disappeared in
> cache line 0 after this change.

There can be many address_spaces in memory, so a size increase is a
concern. Is there anything we can do to minimize the cost of this?