On Tue, Aug 11, 2020 at 2:05 PM <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Aug 11, 2020 at 12:03:51PM -0500, Uriel Guajardo wrote:
> > On Mon, Aug 10, 2020 at 4:43 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Aug 10, 2020 at 09:32:57PM +0000, Uriel Guajardo wrote:
> > > > +static inline void kunit_check_locking_bugs(struct kunit *test,
> > > > +					    unsigned long saved_preempt_count)
> > > > +{
> > > > +	preempt_count_set(saved_preempt_count);
> > > > +#ifdef CONFIG_TRACE_IRQFLAGS
> > > > +	if (softirq_count())
> > > > +		current->softirqs_enabled = 0;
> > > > +	else
> > > > +		current->softirqs_enabled = 1;
> > > > +#endif
> > > > +#if IS_ENABLED(CONFIG_LOCKDEP)
> > > > +	local_irq_disable();
> > > > +	if (!debug_locks) {
> > > > +		kunit_set_failure(test);
> > > > +		lockdep_reset();
> > > > +	}
> > > > +	local_irq_enable();
> > > > +#endif
> > > > +}
> > >
> > > Unless you can guarantee this runs before SMP bringup, that
> > > lockdep_reset() is terminally broken.
> >
> > Good point. KUnit is initialized after SMP is set up, and KUnit can
> > also be built as a module, so it's not a guarantee that we can make.
>
> Even if you could, there's still the question of whether throwing out all
> the dependencies learned during boot is a sensible idea.
>
> > Is there any other way to turn lockdep back on after we detect a
> > failure? It would be ideal if lockdep could still run in the next test
> > case after a failure in a previous one.
>
> Not really; the moment lockdep reports a failure it turns off all
> tracking and we instantly lose state.
>
> You'd have to:
>
>  - delete the 'mistaken' dependency from the graph such that we lose
>    the cycle, otherwise it will continue to find and report the cycle.
>
>  - put every task through a known empty state which turns the tracking
>    back on.
>
> Bart implemented most of what you need for the first item last year or
> so, but the remaining bit and the second item would still be a fair
> amount of work.
> Also, I'm really not sure it's worth it; the kernel should be free of
> lock cycles, so just fix one, reboot and continue.
>
> > I suppose we could only display the first failure that occurs, similar
> > to how lockdep does it. But it could also be useful to developers if
> > they saw failures in subsequent test cases, with the knowledge that
> > those failures may be unreliable.
>
> People already struggle with lockdep reports enough; I really don't want
> to give them dodgy reports to worry about.

Ah, ok! Fair enough, thanks for the info.

Although resetting lockdep would be nice to have in the future, I think
it's enough to report only the first failure and warn the user that
lockdep will be disabled for all further test cases. People can then fix
the issue and re-run the tests.

I'll follow up with a patch that does this.