On Sun, Nov 24, 2024 at 10:34:02PM +0900, Yunseong Kim wrote: > Hi Byungchul, > > Thank you for the great feature. Currently, DEPT has a bug in the > 'dept_key_destroy()' function that must be fixed to ensure proper > operation in the upstream Linux kernel. > > On 5/8/24 6:46 오후, Byungchul Park wrote: > > CURRENT STATUS > > -------------- > > Lockdep tracks acquisition order of locks in order to detect deadlock, > > and IRQ and IRQ enable/disable state as well to take accident > > acquisitions into account. > > > > Lockdep should be turned off once it detects and reports a deadlock > > since the data structure and algorithm are not reusable after detection > > because of the complex design. > > > > PROBLEM > > ------- > > *Waits* and their *events* that never reach eventually cause deadlock. > > However, Lockdep is only interested in lock acquisition order, forcing > > to emulate lock acqusition even for just waits and events that have > > nothing to do with real lock. > > > > Even worse, no one likes Lockdep's false positive detection because that > > prevents further one that might be more valuable. That's why all the > > kernel developers are sensitive to Lockdep's false positive. > > > > Besides those, by tracking acquisition order, it cannot correctly deal > > with read lock and cross-event e.g. wait_for_completion()/complete() for > > deadlock detection. Lockdep is no longer a good tool for that purpose. > > > > SOLUTION > > -------- > > Again, *waits* and their *events* that never reach eventually cause > > deadlock. The new solution, Dept(DEPendency Tracker), focuses on waits > > and events themselves. Dept tracks waits and events and report it if > > any event would be never reachable. > > > > Dept does: > > . Works with read lock in the right way. > > . Works with any wait and event e.i. cross-event. > > . Continue to work even after reporting multiple times. > > . Provides simple and intuitive APIs. > > . Does exactly what dependency checker should do. > > > > Q & A > > ----- > > Q. Is this the first try ever to address the problem? > > A. No. Cross-release feature (b09be676e0ff2 locking/lockdep: Implement > > the 'crossrelease' feature) addressed it 2 years ago that was a > > Lockdep extension and merged but reverted shortly because: > > > > Cross-release started to report valuable hidden problems but started > > to give report false positive reports as well. For sure, no one > > likes Lockdep's false positive reports since it makes Lockdep stop, > > preventing reporting further real problems. > > > > Q. Why not Dept was developed as an extension of Lockdep? > > A. Lockdep definitely includes all the efforts great developers have > > made for a long time so as to be quite stable enough. But I had to > > design and implement newly because of the following: > > > > 1) Lockdep was designed to track lock acquisition order. The APIs and > > implementation do not fit on wait-event model. > > 2) Lockdep is turned off on detection including false positive. Which > > is terrible and prevents developing any extension for stronger > > detection. > > > > Q. Do you intend to totally replace Lockdep? > > A. No. Lockdep also checks if lock usage is correct. Of course, the > > dependency check routine should be replaced but the other functions > > should be still there. > > > > Q. Do you mean the dependency check routine should be replaced right > > away? > > A. No. I admit Lockdep is stable enough thanks to great efforts kernel > > developers have made. Lockdep and Dept, both should be in the kernel > > until Dept gets considered stable. > > > > Q. Stronger detection capability would give more false positive report. > > Which was a big problem when cross-release was introduced. Is it ok > > with Dept? > > A. It's ok. Dept allows multiple reporting thanks to simple and quite > > generalized design. Of course, false positive reports should be fixed > > anyway but it's no longer as a critical problem as it was. > > > > Signed-off-by: Byungchul Park <byungchul@xxxxxx> > > If a module previously checked for dependencies by DEPT is loaded and > then would be unloaded, a kernel panic shall occur when the kernel Hi, Thank you for sharing the issue. Yes. I'm aware of what you are mentioning. I will fix it with high priority. Thanks again. Byungchul > reuses the corresponding memory area for other purposes. This issue must > be addressed as a priority to enable the use of DEPT. Testing this patch > on the Ubuntu kernel confirms the problem. > > > +void dept_key_destroy(struct dept_key *k) > > +{ > > + struct dept_task *dt = dept_task(); > > + unsigned long flags; > > + int sub_id; > > + > > + if (unlikely(!dept_working())) > > + return; > > + > > + if (dt->recursive == 1 && dt->task_exit) { > > + /* > > + * Need to allow to go ahead in this case where > > + * ->recursive has been set to 1 by dept_off() in > > + * dept_task_exit() and ->task_exit has been set to > > + * true in dept_task_exit(). > > + */ > > + } else if (dt->recursive) { > > + DEPT_STOP("Key destroying fails.\n"); > > + return; > > + } > > + > > + flags = dept_enter(); > > + > > + /* > > + * dept_key_destroy() should not fail. > > + * > > + * FIXME: Should be fixed if dept_key_destroy() causes deadlock > > + * with dept_lock(). > > + */ > > + while (unlikely(!dept_lock())) > > + cpu_relax(); > > + > > + for (sub_id = 0; sub_id < DEPT_MAX_SUBCLASSES; sub_id++) { > > + struct dept_class *c; > > + > > + c = lookup_class((unsigned long)k->base + sub_id); > > + if (!c) > > + continue; > > + > > + hash_del_class(c); > > + disconnect_class(c); > > + list_del(&c->all_node); > > + invalidate_class(c); > > + > > + /* > > + * Actual deletion will happen on the rcu callback > > + * that has been added in disconnect_class(). > > + */ > > + del_class(c); > > + } > > + > > + dept_unlock(); > > + dept_exit(flags); > > + > > + /* > > + * Wait until even lockless hash_lookup_class() for the class > > + * returns NULL. > > + */ > > + might_sleep(); > > + synchronize_rcu(); > > +} > > +EXPORT_SYMBOL_GPL(dept_key_destroy); > > Best regards, > Yunseong Kim