On Tue, Jun 06, 2023 at 03:20:13PM -0400, Liam R. Howlett wrote: > * Yu Ma <yu.ma@xxxxxxxxx> [230606 08:23]: > > UnixBench/Execl represents a class of workload where bash scripts are > > spawned frequently to do some short jobs. When running multiple parallel > > tasks, hot osq_lock is observed from do_mmap and exit_mmap. Both of them > > come from load_elf_binary through the call chain > > "execl->do_execveat_common->bprm_execve->load_elf_binary". In do_mmap,it will > > call mmap_region to create vma node, initialize it and insert it to vma > > maintain structure in mm_struct and i_mmap tree of the mapping file, then > > increase map_count to record the number of vma nodes used. The hot osq_lock > > is to protect operations on file’s i_mmap tree. For the mm_struct member > > change like vma insertion and map_count update, they do not affect i_mmap > > tree. Move those operations out of the lock's critical section, to reduce > > hold time on the lock. > > > > With this change, on Intel Sapphire Rapids 112C/224T platform, based on > > v6.0-rc6, the 160 parallel score improves by 12%. The patch has no > > obvious performance gain on v6.4-rc4 due to regression of this benchmark > > from this commit f1a7941243c102a44e8847e3b94ff4ff3ec56f25 (mm: convert > > mm's rss stats into percpu_counter). > > I didn't think it was safe to insert a VMA into the VMA tree without > holding this write lock? We now have a window of time where a file > mapping doesn't exist for a vma that's in the tree? Is this always > safe? Does the locking order in mm/rmap.c need to change? We hold mmap lock on write here, right? Who can observe the VMA until the lock is released? It cannot be retrieved from the VMA tree as it requires at least read mmap lock. And the VMA doesn't exist anywhere else. I believe the change is safe. -- Kiryl Shutsemau / Kirill A. Shutemov