Hi Byungchul, Thank you for the great feature. Currently, DEPT has a bug in the 'dept_key_destroy()' function that must be fixed to ensure proper operation in the upstream Linux kernel. On 5/8/24 6:46 오후, Byungchul Park wrote: > CURRENT STATUS > -------------- > Lockdep tracks acquisition order of locks in order to detect deadlock, > and IRQ and IRQ enable/disable state as well to take accident > acquisitions into account. > > Lockdep should be turned off once it detects and reports a deadlock > since the data structure and algorithm are not reusable after detection > because of the complex design. > > PROBLEM > ------- > *Waits* and their *events* that never reach eventually cause deadlock. > However, Lockdep is only interested in lock acquisition order, forcing > to emulate lock acqusition even for just waits and events that have > nothing to do with real lock. > > Even worse, no one likes Lockdep's false positive detection because that > prevents further one that might be more valuable. That's why all the > kernel developers are sensitive to Lockdep's false positive. > > Besides those, by tracking acquisition order, it cannot correctly deal > with read lock and cross-event e.g. wait_for_completion()/complete() for > deadlock detection. Lockdep is no longer a good tool for that purpose. > > SOLUTION > -------- > Again, *waits* and their *events* that never reach eventually cause > deadlock. The new solution, Dept(DEPendency Tracker), focuses on waits > and events themselves. Dept tracks waits and events and report it if > any event would be never reachable. > > Dept does: > . Works with read lock in the right way. > . Works with any wait and event e.i. cross-event. > . Continue to work even after reporting multiple times. > . Provides simple and intuitive APIs. > . Does exactly what dependency checker should do. > > Q & A > ----- > Q. Is this the first try ever to address the problem? > A. No. Cross-release feature (b09be676e0ff2 locking/lockdep: Implement > the 'crossrelease' feature) addressed it 2 years ago that was a > Lockdep extension and merged but reverted shortly because: > > Cross-release started to report valuable hidden problems but started > to give report false positive reports as well. For sure, no one > likes Lockdep's false positive reports since it makes Lockdep stop, > preventing reporting further real problems. > > Q. Why not Dept was developed as an extension of Lockdep? > A. Lockdep definitely includes all the efforts great developers have > made for a long time so as to be quite stable enough. But I had to > design and implement newly because of the following: > > 1) Lockdep was designed to track lock acquisition order. The APIs and > implementation do not fit on wait-event model. > 2) Lockdep is turned off on detection including false positive. Which > is terrible and prevents developing any extension for stronger > detection. > > Q. Do you intend to totally replace Lockdep? > A. No. Lockdep also checks if lock usage is correct. Of course, the > dependency check routine should be replaced but the other functions > should be still there. > > Q. Do you mean the dependency check routine should be replaced right > away? > A. No. I admit Lockdep is stable enough thanks to great efforts kernel > developers have made. Lockdep and Dept, both should be in the kernel > until Dept gets considered stable. > > Q. Stronger detection capability would give more false positive report. > Which was a big problem when cross-release was introduced. Is it ok > with Dept? > A. It's ok. Dept allows multiple reporting thanks to simple and quite > generalized design. Of course, false positive reports should be fixed > anyway but it's no longer as a critical problem as it was. > > Signed-off-by: Byungchul Park <byungchul@xxxxxx> If a module previously checked for dependencies by DEPT is loaded and then would be unloaded, a kernel panic shall occur when the kernel reuses the corresponding memory area for other purposes. This issue must be addressed as a priority to enable the use of DEPT. Testing this patch on the Ubuntu kernel confirms the problem. > +void dept_key_destroy(struct dept_key *k) > +{ > + struct dept_task *dt = dept_task(); > + unsigned long flags; > + int sub_id; > + > + if (unlikely(!dept_working())) > + return; > + > + if (dt->recursive == 1 && dt->task_exit) { > + /* > + * Need to allow to go ahead in this case where > + * ->recursive has been set to 1 by dept_off() in > + * dept_task_exit() and ->task_exit has been set to > + * true in dept_task_exit(). > + */ > + } else if (dt->recursive) { > + DEPT_STOP("Key destroying fails.\n"); > + return; > + } > + > + flags = dept_enter(); > + > + /* > + * dept_key_destroy() should not fail. > + * > + * FIXME: Should be fixed if dept_key_destroy() causes deadlock > + * with dept_lock(). > + */ > + while (unlikely(!dept_lock())) > + cpu_relax(); > + > + for (sub_id = 0; sub_id < DEPT_MAX_SUBCLASSES; sub_id++) { > + struct dept_class *c; > + > + c = lookup_class((unsigned long)k->base + sub_id); > + if (!c) > + continue; > + > + hash_del_class(c); > + disconnect_class(c); > + list_del(&c->all_node); > + invalidate_class(c); > + > + /* > + * Actual deletion will happen on the rcu callback > + * that has been added in disconnect_class(). > + */ > + del_class(c); > + } > + > + dept_unlock(); > + dept_exit(flags); > + > + /* > + * Wait until even lockless hash_lookup_class() for the class > + * returns NULL. > + */ > + might_sleep(); > + synchronize_rcu(); > +} > +EXPORT_SYMBOL_GPL(dept_key_destroy); Best regards, Yunseong Kim