On 8/31/21 22:40, Liam Howlett wrote:
>
> Hello,
>
> I've been trying to trace the mmap_lock calls using tracepoints and
> captured this behaviour which I cannot explain. This was with the maple
> tree v2 patches running ebizzy with multiple threads in an x86_64 KVM
> using 8 virtual CPUs.
>
> AFAICT, there are zero callers that use the mmap_lock directly besides a
> prefetchw(&mm->mmap_lock);
>
>
> ebizzy-803 [000] .... 5376.655366: lock_release: 00000000de8cf25e lock
> ebizzy-803 [000] .... 5376.655366: lock_release: 00000000b5e38448 ptlock_ptr(page)
> ebizzy-803 [000] .... 5376.655367: lock_release: 000000006b581afd &mm->mmap_lock
> ebizzy-803 [000] .... 5376.655367: mmap_lock_released: mm=000000001de7b122 memcg_path= write=false
>
> ebizzy-803 [000] .... 5376.655369: mmap_lock_start_locking: mm=000000001de7b122 memcg_path= write=false
>
> ebizzy-803 [000] .... 5376.655369: mmap_lock_acquire_returned: mm=000000001de7b122 memcg_path= write=false success=false
>
> ebizzy-803 [000] .... 5376.655369: mmap_lock_start_locking: mm=000000001de7b122 memcg_path= write=false

I was thinking khugepaged interference, but this seems to be system-wide
tracing (judging from <idle>-0), so you would see khugepaged too, right?
On the other hand, it seems strange to have such a long stretch of just
CPU 0 here. Are the other CPUs missing, or is the interleaving merely
imperfect because the timestamps are not perfectly in sync between CPUs,
and in fact another CPU took the lock?

> ebizzy-803 [000] .... 5376.655369: lock_acquire: 000000006b581afd read &mm->mmap_lock
> ebizzy-803 [000] d... 5376.655369: lock_acquire: 00000000984d533e &sem->wait_lock
> ebizzy-803 [000] d... 5376.655370: lock_release: 00000000984d533e &sem->wait_lock
> ebizzy-803 [000] d... 5376.655370: lock_acquire: 00000000cbb1a456 &rq->__lock
> ebizzy-803 [000] d... 5376.655371: lock_acquire: 00000000b3d53a93 read rcu_read_lock
> ebizzy-803 [000] d... 5376.655371: lock_release: 00000000b3d53a93 rcu_read_lock
> ebizzy-803 [000] d... 5376.655371: lock_acquire: 00000000b3d53a93 read rcu_read_lock
> ebizzy-803 [000] d... 5376.655371: lock_release: 00000000b3d53a93 rcu_read_lock
> ebizzy-803 [000] d... 5376.655372: lock_acquire: 00000000b3d53a93 read rcu_read_lock
> ebizzy-803 [000] d... 5376.655372: lock_release: 00000000b3d53a93 rcu_read_lock
> ebizzy-803 [000] d... 5376.655372: lock_release: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] d... 5376.655373: lock_acquire: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] d... 5376.655373: lock_release: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] d... 5376.655373: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d... 5376.655374: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d... 5376.655374: lock_acquire: 000000007c3ac380 read jiffies_seq.seqcount
> <idle>-0 [000] d... 5376.655374: lock_release: 000000007c3ac380 jiffies_seq.seqcount
> <idle>-0 [000] d.h. 5376.655460: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655460: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655460: lock_acquire: 00000000f7c85b8f hrtimer_bases.lock
> <idle>-0 [000] d.h. 5376.655461: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655461: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655461: lock_release: 00000000f7c85b8f hrtimer_bases.lock
> <idle>-0 [000] d.h. 5376.655462: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655462: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655462: lock_acquire: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] d.h. 5376.655463: lock_release: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] d.h. 5376.655463: lock_acquire: 00000000f7c85b8f hrtimer_bases.lock
> <idle>-0 [000] d.h. 5376.655464: lock_release: 00000000f7c85b8f hrtimer_bases.lock
> <idle>-0 [000] d.h. 5376.655464: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655464: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d... 5376.655466: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d... 5376.655466: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d... 5376.655467: lock_acquire: 000000007c3ac380 read jiffies_seq.seqcount
> <idle>-0 [000] d... 5376.655467: lock_release: 000000007c3ac380 jiffies_seq.seqcount
> <idle>-0 [000] d.h. 5376.655488: lock_acquire: 00000000bb7b405b read tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655489: lock_release: 00000000bb7b405b tk_core.seq.seqcount
> <idle>-0 [000] d.h. 5376.655489: lock_acquire: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] dNh. 5376.655490: lock_release: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] dN.. 5376.655490: lock_acquire: 00000000cbb1a456 &rq->__lock
> <idle>-0 [000] d... 5376.655491: lock_release: 00000000cbb1a456 &rq->__lock
> ebizzy-803 [000] d... 5376.655491: lock_acquire: 00000000cbb1a456 &rq->__lock
> ebizzy-803 [000] d... 5376.655492: lock_release: 00000000cbb1a456 &rq->__lock
> ebizzy-803 [000] .... 5376.655492: mmap_lock_acquire_returned: mm=000000001de7b122 memcg_path= write=false success=true
>
> The following tracepoints were enabled:
> events/mmap_lock/enable
> events/lock/enable
>
> My reading of the above trace is that the ebizzy thread dropped the lock
> and immediately attempted to reacquire and detected it was in contention
> so the thread added itself to the list and went to sleep only to wake up
> and get the lock?

Which path does this mmap_read_trylock() followed by an immediate
mmap_read_lock() come from, anyway? I mean, down_read() is implemented
like this internally, but that wouldn't generate these mmap_lock_ events.
(Some sketches of what I mean below, after the quote.)

> Doesn't that mean the mmap_lock is _not_ contended? The lack of
> contention makes sense since there is no tracepoint of an attempt to
> acquire the lock between the dropping of the lock and the same MM waking
> up to get the lock.
>
> What is even stranger is if I change the value of _Q_PENDING_LOOPS from
> (1 << 9) to (1 << 10), the benchmark performs better.
>
> From the above trace and the effects of the _Q_PENDING_LOOPS change, it
> looks like the lock is somehow remaining in a state that causes a
> failure to acquire the lock even when it is not held or contended?
>
>
> Thanks,
> Liam
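
To expand on that trylock-then-lock question: the one caller pattern that
comes to mind is the page fault path. From memory (so treat this as a
simplified sketch rather than the exact code), do_user_addr_fault() in
arch/x86/mm/fault.c does something like:

	if (unlikely(!mmap_read_trylock(mm))) {
		if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
			/* Kernel-mode fault with no exception fixup. */
			bad_area_nosemaphore(regs, error_code, address);
			return;
		}
retry:
		mmap_read_lock(mm);
	} else {
		/* Trylock succeeded, so we skipped down_read()'s might_sleep(). */
		might_sleep();
	}

That would fit your trace: a failed mmap_read_trylock() immediately
followed by a blocking mmap_read_lock() on the same mm, each emitting its
own mmap_lock_ events.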
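
And to spell out why down_read()'s internal fast path can't be what you
are seeing: the mmap_lock_ tracepoints are emitted only by the wrappers in
include/linux/mmap_lock.h, not by the rwsem code itself. Roughly (again a
from-memory sketch):

static inline void mmap_read_lock(struct mm_struct *mm)
{
	__mmap_lock_trace_start_locking(mm, false);
	down_read(&mm->mmap_lock);
	__mmap_lock_trace_acquire_returned(mm, false, true);
}

static inline bool mmap_read_trylock(struct mm_struct *mm)
{
	bool ret;

	__mmap_lock_trace_start_locking(mm, false);
	ret = down_read_trylock(&mm->mmap_lock) != 0;
	__mmap_lock_trace_acquire_returned(mm, false, ret);
	return ret;
}

So the success=false event in your trace should come from an explicit
caller of one of these wrappers (most likely mmap_read_trylock()), not
from anything the rwsem does internally.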
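
On the _Q_PENDING_LOOPS observation: as far as I remember, that constant
only bounds how long queued_spin_lock_slowpath() spins waiting for a
pending->locked hand-over before it gives up and queues, i.e. it tweaks
qspinlock behaviour, not the mmap_lock rwsem itself. The relevant bit in
kernel/locking/qspinlock.c is roughly:

	/*
	 * Wait for an in-progress pending->locked hand-over, but only for
	 * a bounded number of spins before falling through to the queueing
	 * path.
	 */
	if (val == _Q_PENDING_VAL) {
		int cnt = _Q_PENDING_LOOPS;

		val = atomic_cond_read_relaxed(&lock->val,
					       (VAL != _Q_PENDING_VAL) || !cnt--);
	}

So if bumping it from (1 << 9) to (1 << 10) helps the benchmark, I'd
suspect it is changing the behaviour of some spinlock taken on this path
(e.g. the sem->wait_lock visible in your trace), rather than the mmap_lock
being left in a stale state.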