On 3/10/2024 11:56 PM, Paul E. McKenney wrote: > On Sun, Mar 10, 2024 at 08:48:28PM -0400, Joel Fernandes wrote: >> On 3/10/2024 2:56 PM, Paul E. McKenney wrote: >>> On Sun, Mar 10, 2024 at 06:03:30AM -0400, Joel Fernandes wrote: >>>> Hello Ankur and Paul, >>>> >>>> On Mon, Feb 12, 2024 at 09:55:39PM -0800, Ankur Arora wrote: >>>>> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent >>>>> states for read-side critical sections via rcu_all_qs(). >>>>> One reason why this was necessary: lacking preempt-count, the tick >>>>> handler has no way of knowing whether it is executing in a read-side >>>>> critical section or not. >>>>> >>>>> With PREEMPT_AUTO=y, there can be configurations with (PREEMPT_COUNT=y, >>>>> PREEMPT_RCU=n). This means that cond_resched() is a stub which does >>>>> not provide for quiescent states via rcu_all_qs(). >>>>> >>>>> So, use the availability of preempt_count() to report quiescent states >>>>> in rcu_flavor_sched_clock_irq(). >>>>> >>>>> Suggested-by: Paul E. McKenney <paulmck@xxxxxxxxxx> >>>>> Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx> >>>>> --- >>>>> kernel/rcu/tree_plugin.h | 11 +++++++---- >>>>> 1 file changed, 7 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h >>>>> index 26c79246873a..9b72e9d2b6fe 100644 >>>>> --- a/kernel/rcu/tree_plugin.h >>>>> +++ b/kernel/rcu/tree_plugin.h >>>>> @@ -963,13 +963,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) >>>>> */ >>>>> static void rcu_flavor_sched_clock_irq(int user) >>>>> { >>>>> - if (user || rcu_is_cpu_rrupt_from_idle()) { >>>>> + if (user || rcu_is_cpu_rrupt_from_idle() || >>>>> + (IS_ENABLED(CONFIG_PREEMPT_COUNT) && >>>>> + !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) { >>>> >>>> I was wondering if it makes sense to even support !PREEMPT_RCU under >>>> CONFIG_PREEMPT_AUTO. >>>> >>>> AFAIU, this CONFIG_PREEMPT_AUTO series preempts the kernel on >>>> the next tick boundary in the worst case, with all preempt modes including >>>> the preempt=none mode. >>>> >>>> Considering this, does it makes sense for RCU to be non-preemptible in >>>> CONFIG_PREEMPT_AUTO=y? Because if that were the case, and a read-side critical >>>> section extended beyond the tick, then it prevents the PREEMPT_AUTO preemption >>>> from happening, because rcu_read_lock() would preempt_disable(). >>> >>> Yes, it does make sense for RCU to be non-preemptible in kernels >>> built with CONFIG_PREEMPT_AUTO=y and either CONFIG_PREEMPT_NONE=y or >>> CONFIG_PREEMPT_VOLUNTARY=y. >>> As noted in earlier discussions, there are >> >> Sorry if I missed a discussion, appreciate a link. > > It is part of the discussion of the first version of this patch series, > if I recall correctly. > >>> systems that are adequately but not abundantly endowed with memory. >>> Such systems need non-preemptible RCU to avoid preempted-reader OOMs. >> >> Then why don't such systems have a problem with CONFIG_PREEMPT_DYNAMIC=y and >> preempt=none mode? CONFIG_PREEMPT_DYNAMIC forces CONFIG_PREEMPT_RCU=y. There's >> no way to set CONFIG_PREEMPT_RCU=n with CONFIG_PREEMPT_DYNAMIC=y and >> preempt=none boot parameter. IMHO, if this feature is inconsistent with >> CONFIG_PREEMPT_DYNAMIC, that makes it super confusing. In fact, I feel >> CONFIG_PREEMPT_AUTO should instead just be another "preempt=auto" boot parameter >> mode added to CONFIG_PREEMPT_DYNAMIC feature, otherwise the proliferation of >> CONFIG_PREEMPT config options is getting a bit insane. And likely going to be >> burden to the users configuring the PREEMPT Kconfig option IMHO. > > Because such systems are built with CONFIG_PREEMPT_DYNAMIC=n. > > You could argue that we should just build with CONFIG_PREEMPT_AUTO=n, > but the long-term goal of eliminating cond_resched() will make that > ineffective. I see what you mean. We/I could also highlight some of the differences in RCU between DYNAMIC vs AUTO. > >>> Note well that non-preemptible RCU explicitly disables preemption across >>> all RCU readers. >> >> Yes, I mentioned this 'disabling preemption' aspect in my last email. My point >> being, unlike CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_AUTO allows for kernel >> preemption in preempt=none. So the "Don't preempt the kernel" behavior has >> changed. That is, preempt=none under CONFIG_PREEMPT_AUTO is different from >> CONFIG_PREEMPT_NONE=y already. Here we *are* preempting. And RCU is getting on >> the way. It is like saying, you want an option for CONFIG_PREEMPT_RCU to be set >> to =n for CONFIG_PREEMPT=y kernels, sighting users who want a fully-preemptible >> kernel but are worried about reader preemptions. > > Such users can simply avoid building with either CONFIG_PREEMPT_NONE=y > or with CONFIG_PREEMPT_VOLUNTARY=y. They might also experiment with > CONFIG_RCU_BOOST=y, and also with short timeouts until boosting. > If that doesn't do what they need, we talk with them and either help > them configure their kernels, make RCU do what they need, or help work > out some other way for them to get their jobs done. Makes sense. >> That aside, as such, I do agree your point of view, that preemptible readers >> presents a problem to folks using preempt=none in this series and we could >> decide to keep CONFIG_PREEMPT_RCU optional for whoever wants it that way. I was >> just saying that I want CONFIG_PREEMPT_AUTO's preempt=none mode to be somewhat >> consistent with CONFIG_PREEMPT_DYNAMIC's preempt=none. Because I'm pretty sure a >> week from now, no one will likely be able to tell the difference ;-). So IMHO >> either CONFIG_PREEMPT_DYNAMIC should be changed to make CONFIG_PREEMPT_RCU >> optional, or this series should be altered to force CONFIG_PREEMPT_RCU=y. >> >> Let me know if I missed something. > > Why not key off of the value of CONFIG_PREEMPT_DYNAMIC? That way, > if both CONFIG_PREEMPT_AUTO=y and CONFIG_PREEMPT_DYNAMIC=y, RCU is > always preemptible. Then CONFIG_PREEMPT_DYNAMIC=y enables boot-time > (and maybe even run-time) switching between preemption flavors, while > CONFIG_PREEMPT_AUTO instead enables unconditional preemption of any > region of code that has not explicitly disabled preemption (or irq or > bh or whatever). That could be done. But currently, these patches disable DYNAMIC if AUTO is enabled in the config. I think the reason is the 2 features are incompatible. i.e. DYNAMIC wants to override the preemption mode at boot time, where as AUTO wants the scheduler to have a say in it using the need-resched LAZY bit. > But one way or another, we really do need non-preemptible RCU in > conjunction with CONFIG_PREEMPT_AUTO=y. Ok, lets go with that. Thanks, - Joel