On Fri, Mar 22, 2019 at 12:58:47PM +1100, NeilBrown wrote: > On Thu, Mar 21 2019, J. Bruce Fields wrote: > > > On Thu, Mar 21, 2019 at 07:27:44AM -0400, Jeff Layton wrote: > >> Andreas reported that he was seeing the tdbtorture test fail in > >> some cases with -EDEADLCK when it wasn't before. Some debugging > >> showed that deadlock detection was sometimes discovering the > >> caller's lock request itself in a dependency chain. > >> > >> If posix_locks_deadlock() fails to find a deadlock, the caller_fl > >> will be passed to __locks_insert_block(), and this wakes up all > >> locks that are blocked on caller_fl, clearing the fl_blocker link. > >> > >> So if posix_locks_deadlock() finds caller_fl while searching for > >> a deadlock, > > > > I'm feeling dense. Could you step me through the scenario in a little > > more detail? > > Three processes: A, B, C. One byte in one file: X > > A gets a read lock on X. Doesn't block, so nothing goes in the hash > table. > > B gets a read lock on X. Doesn't block. > > C tries to get a write lock on X. It cannot so it can block on either A > or B. It chooses B. It check if this might cause a deadlock, but the > hash is empty so it doesn't. It gets added to the hash. "C -> B" > > A tries to get a write lock on X. It cannot because B has a read lock, > so it needs to block on B, or some child. C is a child and A's > request conflicts with C's request, so A blocks on C. > It first checks: Does A->C cause a loop in the hash-table? No. So > it proceeds to block and adds "A->C" to the hash table. > > B drops its read lock. This wakes up the first child: C. > > C tries (again) to get a write lock on X. It cannot because A still > holds a read lock, so it will have to block on A. First it must > check if "C->A" will cause a loop in the hash-table ... Yes it will > because "A->C" is already there - error! > > This is a false-positive because the very next thing to be done after > posix_locks_deadlock() (while still holding all spinlocks), is > __locks_insert_block(C->A) which calls __locks_wake_up_blocks(C) which > calls __locks_delete_block() for any pending block ??->C and so will > remove A->C from the hash. > > Maybe we could call __locks_wake_up_block() *before* calling > posix_locks_deadlock(). That *might* make more sense. > > > > > > Also, how do we know this catches every such case? > > The only links in the hash table that are about to be removed are those > that record that something is blocked by C (the lock we are trying to > get). So these are the only ones that it is reasonable to special-case > in posix_locks_deadlock(). > > > > > And why aren't we unhashing blocks when we wake them up? > > We do - that is exactly when we unhash them. The problem is they we > wake them up *after* we check for errors, rather than before. Thanks! OK, somehow I overlooked __locks_wake_up_blocks->__locks_delete_block->locks_delete_global_block. And I missed that all this was happening while still under the blocked_lock_lock, so we don't have to worry about other tasks leaving locks in this state. It's possible I'm still being a little dense, but the new check could really use a careful comment (maybe even as detailed as the above). --b.