On Thu, Sep 03, 2020 at 05:08:19PM +0200, peterz@xxxxxxxxxxxxx wrote:
> On Thu, Sep 03, 2020 at 04:36:35PM +0200, Ulf Hansson wrote:
> > On Thu, 3 Sep 2020 at 15:53, <peterz@xxxxxxxxxxxxx> wrote:
> > >  static int cpu_pm_notify(enum cpu_pm_event event)
> > >  {
> > >  	int ret;
> > >
> > > +	lockdep_assert_irqs_disabled();
> >
> > Nitpick, maybe the lockdep should be moved to a separate patch.
>
> Well, the unregister relies on IRQs being disabled here, so I figured
> asserting this was a good thing ;-)
>
> Starting the audit below, this might not in fact be true, which then
> invalidates the unregister implementation. In particular the notifier in
> arch/arm/kernel/hw_breakpoint.c seems to unconditionally enable IRQs.
>
> > > +	ret = raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL);
> >
> > Converting to raw_notifiers seems reasonable - if we need to avoid the
> > RCU usage.
> >
> > My point is, I wonder about if the notifier callbacks themselves are
> > safe from RCU usage. For example, I would not be surprised if tracing
> > is happening behind them.
>
> A bunch of them seem to call into the clk domain stuff, and I think
> there's tracepoints in that.
>
> > Moreover, I am not sure that we really need to prevent and limit
> > tracing from happening. Instead we could push rcu_idle_enter|exit()
> > further down to the arch specific code in the cpuidle drivers, as you
> > kind of all proposed earlier.
>
> Well, at some point the CPU is in a really dodgy state, ISTR there being
> ARM platforms where you have to manually leave the cache coherency
> fabric and all sorts of insanity. There should be a definite cut-off on
> tracing before that.
>
> Also, what is the point of all this clock and power domain callbacks, if
> not to put the CPU into an extremely low power state, surely you want to
> limit the amount of code that's ran when the CPU is in such a state.
>
> > In this way, we can step by step, move to a new "version" of
> > cpu_pm_enter() that doesn't have to deal with rcu_irq_enter_irqson(),
> > because RCU hasn't been pushed to idle yet.
>
> That should be easy enough to audit. The thing is that mainline is now
> generating (debug) splats, and some people are upset with this.
>
> If you're ok with ARM not being lockdep clean while this is being
> reworked I'm perfectly fine with that.
>
> (There used to be a separate CONFIG for RCU-lockdep, but that seems to
> have been removed)

CONFIG_PROVE_RCU still gates RCU_LOCKDEP_WARN(), but it is now a
def_bool that follows CONFIG_PROVE_LOCKING.

It would not be hard to make CONFIG_PROVE_RCU separately settable only
for arm, if that would help.

							Thanx, Paul
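
For illustration only, here is a minimal sketch of what "separately
settable only for arm" might look like in Kconfig. It is not a posted
patch: the prompt text and the "if ARM" condition are assumptions, and
the placement of the entry is left open.

```kconfig
# Hypothetical sketch, not an actual patch.
# The prompt is only visible on ARM, so every other architecture keeps
# today's behaviour (PROVE_RCU == PROVE_LOCKING), while ARM builds with
# PROVE_LOCKING=y can answer "n" here.
config PROVE_RCU
	bool "Prove RCU usage (RCU-lockdep)" if ARM
	depends on PROVE_LOCKING
	default y
```

With something along these lines, an ARM debug build could keep
CONFIG_PROVE_LOCKING=y and turn off only the RCU_LOCKDEP_WARN() checks
while the cpu_pm notifier paths are being reworked.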