On Thu, Oct 18, 2018 at 3:11 PM Jan Kara <jack@xxxxxxx> wrote: > > On Thu 18-10-18 14:22:55, Amir Goldstein wrote: > > Kernel test robot reported [1] a 6% performance regression in a > > concurrent unlink(2) workload on commit 60f7ed8c7c4d ("fsnotify: send > > path type events to group with super block marks"). > > > > The performance test was run with no fsnotify marks at all on the > > data set, so the only extra instructions added by the offending > > commit are tests of the super_block fields s_fsnotify_{marks,mask} > > and these tests happen on almost every single inode access. > > > > When adding those fields to the super_block struct, we did not give much > > thought of placing them on a hot cache lines (we just placed them at the > > end of the struct). > > > > Re-organize struct super_block to try and keep some frequently accessed > > fields on the same cache line. > > > > Move the frequently accessed fields s_fsnotify_{marks,mask} near the > > frequently accessed fields s_fs_info,s_time_gran, while filling a 64bit > > alignment hole after s_time_gran. > > > > Move the seldom accessed fields s_id,s_uuid,s_max_links,s_mode near the > > seldom accessed fields s_vfs_rename_mutex,s_subtype. > > > > Rong Chen confirmed that this patch solved the reported problem. > > ... > > > + /* START frequently accessed fields block */ > > The movement of struct entries is fine. But I don't think the comments > about START / END of sections are really useful there are much more entries > in the struct which are frequently or seldomly accesses and furthemore it's > going to depend on the workload. > > Also how are you going to make sure the entries are going to fall into the > same cache line? Different architectures are going to have different > cacheline size and also the offset is going to be different... OTOH > s_writers is usually relatively frequently accessed so probably it doesn't > matter too much. > > So maybe just change the comment to something like: > > /* > * Keep s_fs_info, s_time_gran, s_fsnotify_mask, and > * s_fsnotify_marks together for cache efficiency. They are frequently > * accessed and rarely modified. > */ > Yes, fine by me. I am assuming you are making those changes on apply. I was going to include s_writers and s_dquot in the "frequently accessed block", they are also quite frequently accessed, but only on write access patterns, but those fields are too big to try to optimize cache lines, so I kept them out of the comment. Thanks, Amir.