On Thu, Aug 13, 2020 at 9:11 AM Chinwen Chang <chinwen.chang@xxxxxxxxxxxx> wrote: > On Thu, 2020-08-13 at 02:53 -0700, Michel Lespinasse wrote: > > On Wed, Aug 12, 2020 at 7:14 PM Chinwen Chang > > <chinwen.chang@xxxxxxxxxxxx> wrote: > > > Recently, we have observed some janky issues caused by unpleasantly long > > > contention on mmap_lock which is held by smaps_rollup when probing large > > > processes. To address the problem, we let smaps_rollup detect if anyone > > > wants to acquire mmap_lock for write attempts. If yes, just release the > > > lock temporarily to ease the contention. > > > > > > smaps_rollup is a procfs interface which allows users to summarize the > > > process's memory usage without the overhead of seq_* calls. Android uses it > > > to sample the memory usage of various processes to balance its memory pool > > > sizes. If no one wants to take the lock for write requests, smaps_rollup > > > with this patch will behave like the original one. > > > > > > Although there are on-going mmap_lock optimizations like range-based locks, > > > the lock applied to smaps_rollup would be the coarse one, which is hard to > > > avoid the occurrence of aforementioned issues. So the detection and > > > temporary release for write attempts on mmap_lock in smaps_rollup is still > > > necessary. > > > > I do not mind extending the mmap lock API as needed. However, in the > > past I have tried adding rwsem_is_contended to mlock(), and later to > > mm_populate() paths, and IIRC gotten pushback on it both times. I > > don't feel strongly on this, but would prefer if someone else approved > > the rwsem_is_contended() use case. > > > Hi Michel, > > Thank you for your kind feedback. > > In my opinion, the difference between the case in smaps_rollup and the > one in your example is that, for the former, the interference comes from > the outside of the affected process, for the latter, it doesn't. > > In other words, anyone may use smaps_rollup to probe the affected > process without the information about what step the affected one is > executing. Thanks, that is a good enough point for me :) -- Michel "Walken" Lespinasse A program is never fully debugged until the last user dies.