On 2019-08-26 09:29:45 [-0700], Paul E. McKenney wrote:
> > The mechanism that is used here may change in future. I just wanted to
> > make sure that from RCU's side it is okay to schedule here.
>
> Good point.
>
> The effect from RCU's viewpoint will be to split any non-rcu_read_lock()
> RCU read-side critical section at this point. This already happens in a
> few places, for example, rcu_note_context_switch() constitutes an RCU
> quiescent state despite being invoked with interrupts disabled (as is
> required!). The __schedule() function just needs to understand (and does
> understand) that the RCU read-side critical section that would otherwise
> span that call to rcu_note_context_switch() is split in two by that call.

Okay. So I read this as: invoking schedule() at this point is okay.
Looking at this again, this could also happen on a PREEMPT=y kernel if
the kernel decides to preempt a task within an rcu_read_lock() section
and put it back later on another CPU.

> However, if this was instead an rcu_read_lock() critical section within
> a PREEMPT=y kernel, then if a schedule() occurred within stop_one_cpu(),
> RCU would consider that critical section to be preempted. This means
> that any RCU grace period that is blocked by this RCU read-side critical
> section would remain blocked until stop_one_cpu() resumed, returned,
> and so on until the matching rcu_read_unlock() was reached. In other
> words, RCU would consider that RCU read-side critical section to span
> the call to stop_one_cpu() even if stop_one_cpu() invoked schedule().

Isn't that my example from above and what we do in RT? My understanding
is that this is the reason why we need BOOST on RT; otherwise the RCU
critical section could remain blocked for some time.

> On the other hand, within a PREEMPT=n kernel, the call to schedule()
> would split even an rcu_read_lock() critical section. Which is why I
> asked earlier if sleeping_lock_inc() and sleeping_lock_dec() are no-ops
> in !PREEMPT_RT_BASE kernels. We would after all want the usual lockdep
> complaints in that case.

sleeping_lock_inc() and sleeping_lock_dec() are RT specific. They are
part of RT's spin_lock() implementation and are used by RCU
(rcu_note_context_switch()) to not complain if invoked within a critical
section.

> Does that help, or am I missing the point?
>
> 							Thanx, Paul

Sebastian
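
For reference, the three cases discussed in this thread can be summarised
in a small userspace sketch. This is a toy model, not kernel code: the
enum and function names below are hypothetical and exist only to restate
the rules above (a non-rcu_read_lock() section is always split by a
context switch, an rcu_read_lock() section is split on PREEMPT=n, and on
PREEMPT=y the reader is treated as preempted and keeps blocking the grace
period).

```c
#include <stdbool.h>

/* Toy model only. None of these names exist in the kernel. */
enum reader_kind {
	READER_IMPLICIT,	/* non-rcu_read_lock() section, e.g. irqs disabled */
	READER_RCU_READ_LOCK,	/* explicit rcu_read_lock() section */
};

enum cs_effect {
	CS_SPLITS_SECTION,	/* switch acts as a quiescent state, section is split */
	CS_READER_PREEMPTED,	/* section spans the switch, reader counts as preempted */
};

/* How a context switch inside a read-side section is treated in this model. */
static enum cs_effect context_switch_effect(enum reader_kind kind,
					    bool preempt_kernel)
{
	/* Only an rcu_read_lock() section on a PREEMPT=y kernel survives. */
	if (kind == READER_RCU_READ_LOCK && preempt_kernel)
		return CS_READER_PREEMPTED;

	/* Everything else is split in two by the context switch. */
	return CS_SPLITS_SECTION;
}

/* True if a grace period stays blocked across the context switch. */
static bool grace_period_stays_blocked(enum reader_kind kind,
				       bool preempt_kernel)
{
	return context_switch_effect(kind, preempt_kernel) ==
	       CS_READER_PREEMPTED;
}
```

In this model only the (READER_RCU_READ_LOCK, PREEMPT=y) combination keeps
the grace period blocked, which is the case where a long-preempted reader
can hold things up.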