On Sat, Jan 14, 2023 at 03:18:32PM +0800, Hillf Danton wrote:
> On Fri, 13 Jan 2023 16:17:59 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > On Sat, Jan 14, 2023 at 07:58:09AM +0800, Hillf Danton wrote:
> > > On 13 Jan 2023 09:58:10 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > > > On Fri, Jan 13, 2023 at 09:03:30PM +0800, Hillf Danton wrote:
> > > > > On 12 Jan 2023 22:59:54 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > > > > > --- a/kernel/rcu/srcutree.c
> > > > > > +++ b/kernel/rcu/srcutree.c
> > > > > > @@ -1267,6 +1267,8 @@ static void __synchronize_srcu(struct srcu_struct *ssp, bool do_norm)
> > > > > >  {
> > > > > >  	struct rcu_synchronize rcu;
> > > > > >
> > > > > > +	srcu_lock_sync(&ssp->dep_map);
> > > > > > +
> > > > > >  	RCU_LOCKDEP_WARN(lockdep_is_held(ssp) ||
> > > > > >  			 lock_is_held(&rcu_bh_lock_map) ||
> > > > > >  			 lock_is_held(&rcu_lock_map) ||
> > > > > > --
> > > > > > 2.38.1
> > > > >
> > > > > The following deadlock is able to escape srcu_lock_sync() because the
> > > > > __lock_release folded in sync leaves one lock on the sync side.
> > > > >
> > > > > 	cpu9		cpu0
> > > > > 	---		---
> > > > > 	lock A		srcu_lock_acquire(&ssp->dep_map);
> > > > > 	srcu_lock_sync(&ssp->dep_map);
> > > > > 			lock A
> > > >
> > > > But isn't it just the srcu_mutex_ABBA test case in patch #3, and my run
> > > > of lockdep selftest shows we can catch it. Anything subtle I'm missing?
> > >
> > > I am leaning to not call it ABBA deadlock, because B is unlocked.
> > >
> > > 	task X		task Y
> > > 	---		---
> > > 	lock A
> > > 	lock B
> > > 			lock B
> > > 	unlock B
> > > 	wait_for_completion E
> > >
> > > 			lock A
> > > 			complete E
> > >
> > > And no deadlock should be detected/caught after B goes home.
> >
> > Your example makes me more confused..
> >
> > given the case:
> >
> > 	task X			task Y
> > 	---			---
> > 	mutex_lock(A);
> > 				srcu_read_lock(B);
> > 	synchronze_srcu(B);
> > 				mutex_lock(A);
> >
> > isn't it a deadlock?
>
> Yes and nope, see below.
>
> > If your example, A, B or E which one is srcu?
>
> A and B are mutex, and E is completion in my example to show the failure
> of catching deadlock in case of non-fake lock. Now see srcu after your change.
>
> 	task X					task Y
> 	---					---
> 	mutex_lock(A);
> 						srcu_read_lock(B);
> 						srcu_lock_acquire(&B->dep_map);
> 						a) lock_map_acquire_read(&B->dep_map);
> 	synchronze_srcu(B);
> 	__synchronize_srcu(B);
> 	srcu_lock_sync(&B->dep_map);
> 	lock_map_sync(&B->dep_map);
> 	lock_sync(&B->dep_map);
> 	__lock_acquire(&B->dep_map);

At this point, lockdep adds the dependency A -> B to the dependency graph.

> 						b) lock_map_acquire_read(&B->dep_map);
> 	__lock_release(&B->dep_map);
> 						c) lock_map_acquire_read(&B->dep_map);
> 						mutex_lock(A);

And here, lockdep will try to add the dependency B -> A to the dependency
graph, and find that A -> B -> A would form a cycle (with strong
dependencies), therefore lockdep knows it's a deadlock.

>
> No deadlock could be detected if taskY takes mutexA after taskX releases B,

The time at which taskX releases B doesn't matter, since lockdep detects
deadlocks from the dependency graph rather than after the fact. The A -> B
dependency stays in the graph after B is released.

> and how taskY acquires B does not matter as per the a), b) and c) modes in
> the above chart, again because releasing lock can break deadlock in general.

I have test cases showing that the above deadlock can be detected, so if you
think there is a deadlock that may escape my change, feel free to add a
test case to lib/locking-selftest.c ;-)

Regards,
Boqun
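
To illustrate why the release timing is irrelevant, here is a toy userspace
model of graph-based detection. It is only a sketch under simplifying
assumptions: it is not lockdep's code, it ignores read/write and
strong/weak dependency distinctions, and every name in it (dep,
reachable, add_dependency, srcu_mutex_abba_is_flagged, MAX_CLASSES) is
made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8	/* hypothetical limit for this toy model */

enum { MUTEX_A, SRCU_B };	/* two lock classes, as in the chart above */

/* dep[a][b] is true once some task has acquired b while holding a. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "next was acquired while held was held".
 * Returns false, without adding the edge, if the edge would close a cycle.
 */
static bool add_dependency(int held, int next)
{
	bool seen[MAX_CLASSES] = { false };

	assert(held >= 0 && held < MAX_CLASSES);
	assert(next >= 0 && next < MAX_CLASSES);

	if (reachable(next, held, seen))
		return false;
	dep[held][next] = true;
	return true;
}

/* Replay the scenario above; true means the second acquisition is flagged. */
static bool srcu_mutex_abba_is_flagged(void)
{
	memset(dep, 0, sizeof(dep));

	/* task X: mutex_lock(A); synchronize_srcu(B);  records A -> B */
	if (!add_dependency(MUTEX_A, SRCU_B))
		return false;

	/*
	 * task X "releases" B here. Nothing is removed from dep[][]:
	 * edges describe lock ordering, not which locks are currently held.
	 */

	/* task Y: srcu_read_lock(B); mutex_lock(A);  tries to record B -> A */
	return !add_dependency(SRCU_B, MUTEX_A);
}
```

Calling srcu_mutex_abba_is_flagged() returns true in this model no matter
where the release is placed, because the A -> B edge is never dropped once
it has been recorded.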