On 1/15/24 19:38, Suren Baghdasaryan wrote:

Hi,

> The issue this patchset is trying to address is mmap_lock contention when
> a low priority task (monitoring, data collecting, etc.) blocks a higher
> priority task from making updates to the address space. The contention is
> due to the mmap_lock being held for read when reading proc/pid/maps.
> With maple_tree introduction, VMA tree traversals are RCU-safe and per-vma
> locks make VMA access RCU-safe. This provides an opportunity for lock-less
> reading of proc/pid/maps. We still need to overcome a couple obstacles:
> 1. Make all VMA pointer fields used for proc/pid/maps content generation
> RCU-safe;
> 2. Ensure that proc/pid/maps data tearing, which is currently possible at
> page boundaries only, does not get worse.

Hm, I thought we would only choose this more complicated approach in case
additional tearing becomes a problem, and at first assume that if software
can deal with page-boundary tearing, it can deal with sub-page tearing too?

> The patchset deals with these issues but there is a downside which I would
> like to get input on:
> This change introduces unfairness towards the reader of proc/pid/maps,
> which can be blocked by an overly active/malicious address space modifier.

So this is a consequence of the validate() operation, right? We could avoid
this if we allowed sub-page tearing.

> A couple of ways I thought we can address this issue are:
> 1. After several lock-less retries (or some time limit) to fall back to
> taking mmap_lock.
> 2. Employ lock-less reading only if the reader has low priority,
> indicating that blocking it is not critical.
> 3. Introducing a separate procfs file which publishes the same data in
> lock-less manner.
>
> I imagine a combination of these approaches can also be employed.
> I would like to get feedback on this from the Linux community.
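
To make option 1 above (bounded lock-less retries, then fall back to taking
the lock) concrete, here is a rough userspace sketch. It is only an
illustration and not taken from the patches: fake_mm, snapshot,
MAX_LOCKLESS_RETRIES and all helper names are made up, a pthread mutex stands
in for mmap_lock, and a C11 sequence counter stands in for whatever the
validation step would actually check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_LOCKLESS_RETRIES 3 /* hypothetical limit, not from the patchset */

enum read_path { READ_LOCKLESS, READ_LOCKED };

/* Userspace stand-in for an address space; not the kernel's mm_struct. */
struct fake_mm {
	pthread_mutex_t lock;        /* plays the role of mmap_lock */
	atomic_uint seq;             /* odd while a writer is mid-update */
	_Atomic unsigned long start; /* stand-in for one VMA's range */
	_Atomic unsigned long end;
};

struct snapshot {
	unsigned long start, end;
};

static void fake_mm_init(struct fake_mm *mm, unsigned long start,
			 unsigned long end)
{
	pthread_mutex_init(&mm->lock, NULL);
	atomic_init(&mm->seq, 0);
	atomic_init(&mm->start, start);
	atomic_init(&mm->end, end);
}

/* Writer: serialized by the lock, brackets the update with seq bumps. */
static void fake_mm_update(struct fake_mm *mm, unsigned long start,
			   unsigned long end)
{
	pthread_mutex_lock(&mm->lock);
	unsigned int seq = atomic_load_explicit(&mm->seq, memory_order_relaxed);
	assert(!(seq & 1)); /* no other writer can be in progress here */
	atomic_store_explicit(&mm->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&mm->start, start, memory_order_relaxed);
	atomic_store_explicit(&mm->end, end, memory_order_relaxed);
	atomic_store_explicit(&mm->seq, seq + 2, memory_order_release);
	pthread_mutex_unlock(&mm->lock);
}

/* One optimistic attempt: read, then validate that no writer interfered. */
static bool try_read_lockless(struct fake_mm *mm, struct snapshot *out)
{
	unsigned int begin = atomic_load_explicit(&mm->seq, memory_order_acquire);

	if (begin & 1)
		return false; /* writer in progress */
	out->start = atomic_load_explicit(&mm->start, memory_order_relaxed);
	out->end = atomic_load_explicit(&mm->end, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&mm->seq, memory_order_relaxed) == begin;
}

/* Bounded lock-less retries, then fall back to taking the lock. */
static enum read_path read_snapshot(struct fake_mm *mm, struct snapshot *out)
{
	for (int i = 0; i < MAX_LOCKLESS_RETRIES; i++)
		if (try_read_lockless(mm, out))
			return READ_LOCKLESS;

	pthread_mutex_lock(&mm->lock);
	out->start = atomic_load_explicit(&mm->start, memory_order_relaxed);
	out->end = atomic_load_explicit(&mm->end, memory_order_relaxed);
	pthread_mutex_unlock(&mm->lock);
	return READ_LOCKED;
}
```

The point of the bound is that a busy writer can make the reader lose at most
MAX_LOCKLESS_RETRIES attempts before the reader queues on the lock like
everyone else, which restores fairness at the cost of reintroducing the
contention in exactly the case where the address space is being modified
heavily.
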
>
> Note: mmap_read_lock/mmap_read_unlock sequence inside validate_map()
> can be replaced with more efficient rwsem_wait() proposed by Matthew
> in [1].
>
> [1] https://lore.kernel.org/all/ZZ1+ZicgN8dZ3zj3@xxxxxxxxxxxxxxxxxxxx/
>
> Suren Baghdasaryan (3):
>   mm: make vm_area_struct anon_name field RCU-safe
>   seq_file: add validate() operation to seq_operations
>   mm/maps: read proc/pid/maps under RCU
>
>  fs/proc/internal.h        |   3 +
>  fs/proc/task_mmu.c        | 130 ++++++++++++++++++++++++++++++++++----
>  fs/seq_file.c             |  24 ++++++-
>  include/linux/mm_inline.h |  10 ++-
>  include/linux/mm_types.h  |   3 +-
>  include/linux/seq_file.h  |   1 +
>  mm/madvise.c              |  30 +++++++--
>  7 files changed, 181 insertions(+), 20 deletions(-)
>