On Wed, May 11, 2022 at 07:04:51PM +0900, Hyeonggon Yoo wrote:
> On Wed, May 11, 2022 at 08:39:29AM +0900, Byungchul Park wrote:
> > On Tue, May 10, 2022 at 08:18:12PM +0900, Hyeonggon Yoo wrote:
> > > On Mon, May 09, 2022 at 09:16:37AM +0900, Byungchul Park wrote:
> > > > CASE 1.
> > > >
> > > >    lock L with depth n
> > > >    lock_nested L' with depth n + 1
> > > >    ...
> > > >    unlock L'
> > > >    unlock L
> > > >
> > > > This case is allowed by Lockdep.
> > > > This case is allowed by DEPT because it's not a deadlock.
> > > >
> > > > CASE 2.
> > > >
> > > >    lock L with depth n
> > > >    lock A
> > > >    lock_nested L' with depth n + 1
> > > >    ...
> > > >    unlock L'
> > > >    unlock A
> > > >    unlock L
> > > >
> > > > This case is allowed by Lockdep.
> > > > This case is *NOT* allowed by DEPT because it's a *DEADLOCK*.
> > >
> > > Yeah, in previous threads we discussed this [1].
> > >
> > > And the case was:
> > >
> > >    scan_mutex -> object_lock -> kmemleak_lock -> object_lock
> > >
> > > And DEPT reported:
> > >
> > >    object_lock -> kmemleak_lock, kmemleak_lock -> object_lock
> > >
> > > as a deadlock.
> > >
> > > But IIUC, what DEPT reported happens only under scan_mutex, and it
> > > is not as simple as just not taking those locks, because the object
> > > can be removed from the list and freed while scanning via
> > > kmemleak_free() without kmemleak_lock and object_lock.

The above kmemleak sequence shouldn't deadlock since those locks, even
if taken in a different order, are serialised by scan_mutex.

For various reasons, mainly to reduce latency, I ended up with some
fine-grained, per-object locking. For object allocation (rbtree
modification) and tree search, we use kmemleak_lock. During scanning
(which can take minutes under scan_mutex), we want to prevent (a) long
latencies and (b) freeing the object being scanned. We release the
locks regularly for (a) and hold object->lock for (b).

In another thread Byungchul mentioned:

| context X                     context Y
|
| lock mutex A                  lock mutex A
| lock B                        lock C
| lock C                        lock B
| unlock C                      unlock B
| unlock B                      unlock C
| unlock mutex A                unlock mutex A
|
| In my opinion, lock B and lock C are unnecessary if they are always
| along with lock mutex A. Or we should keep correct lock order across
| all the code.

If these were the only two places, yes, locks B and C would be
unnecessary. But we also have those locks acquired (not nested) on the
allocation path (kmemleak_lock) and the freeing path (object->lock),
and we don't want to block those paths while scan_mutex is held.

That said, we may be able to use a single kmemleak_lock for everything.
The object freeing path may be affected slightly during scanning, but
the scanning code does release the lock every MAX_SCAN_SIZE bytes. It
may even get slightly faster, as we'd be hammering a single lock (I'll
do some benchmarks).

But from a correctness perspective, I think the DEPT tool should be
improved a bit to detect when such out-of-order locking is serialised
by an enclosing lock/mutex.

-- 
Catalin
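
P.S. To make that last point concrete, below is a minimal userspace
sketch of the pattern under discussion: plain pthread mutexes stand in
for the kernel locks, and the A/B/C names follow Byungchul's example
above (none of this is the actual kmemleak code). B and C are acquired
in opposite orders in the two contexts, but only ever while A is held,
so the two inner critical sections can never interleave and the program
cannot deadlock:

/* serialised.c - build with: cc -pthread serialised.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;	/* outer "scan_mutex" role */
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t C = PTHREAD_MUTEX_INITIALIZER;

/* Context X: A -> B -> C */
static void *ctx_x(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&A);
	pthread_mutex_lock(&B);
	pthread_mutex_lock(&C);
	puts("X: holds A, B, C");
	pthread_mutex_unlock(&C);
	pthread_mutex_unlock(&B);
	pthread_mutex_unlock(&A);
	return NULL;
}

/* Context Y: A -> C -> B (inner locks in the opposite order) */
static void *ctx_y(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&A);
	pthread_mutex_lock(&C);
	pthread_mutex_lock(&B);
	puts("Y: holds A, C, B");
	pthread_mutex_unlock(&B);
	pthread_mutex_unlock(&C);
	pthread_mutex_unlock(&A);
	return NULL;
}

int main(void)
{
	pthread_t x, y;

	pthread_create(&x, NULL, ctx_x, NULL);
	pthread_create(&y, NULL, ctx_y, NULL);
	pthread_join(x, NULL);
	pthread_join(y, NULL);

	/*
	 * A dependency checker sees both B -> C and C -> B, i.e. a
	 * cycle, but the cycle is unreachable at runtime because both
	 * edges are only ever created while the same mutex A is held.
	 */
	puts("done, no deadlock");
	return 0;
}

The B -> C / C -> B "cycle" is real in the dependency graph but
unreachable in execution; distinguishing the two is exactly what the
suggested DEPT improvement would need to do.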