Re: [PATCH 15/30] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Mon, 11 Mar 2024 20:12:58 +0100

On Mon, Mar 11 2024 at 11:25, Joel Fernandes wrote:
> On 3/11/2024 1:18 AM, Ankur Arora wrote:
>>> Yes, I mentioned this 'disabling preemption' aspect in my last email. My point
>>> being, unlike CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_AUTO allows for kernel
>>> preemption in preempt=none. So the "Don't preempt the kernel" behavior has
>>> changed. That is, preempt=none under CONFIG_PREEMPT_AUTO is different from
>>> CONFIG_PREEMPT_NONE=y already. Here we *are* preempting. And RCU is getting on
>> 
>> I think that's a view from too close to the implementation. Someone
>> using the kernel is not necessarily concerned with whether tasks are
>> preempted or not. They are concerned with throughput and latency.
>
> No, we are not only talking about that (throughput/latency). We are also talking
> about the issue related to RCU reader-preemption causing OOM (well and that
> could hurt both throughput and latency as well).

That happens only when PREEMPT_RCU=y. For PREEMPT_RCU=n the read side
critical sections still have preemption disabled.
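To illustrate the point above, here is a minimal user-space sketch, not kernel code. It assumes the usual PREEMPT_RCU=n, PREEMPT_COUNT=y arrangement in which entering a read-side critical section bumps a preemption counter. All `model_*` names are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task preemption counter. */
static int model_preempt_count;

/* Model of a non-preemptible RCU read lock: it only disables preemption. */
static void model_rcu_read_lock(void)
{
	model_preempt_count++;
}

/* Leaving the critical section re-enables preemption once nesting unwinds. */
static void model_rcu_read_unlock(void)
{
	model_preempt_count--;
}

/* The task may be preempted only while the counter is zero. */
static bool model_preemptible(void)
{
	return model_preempt_count == 0;
}
```

In this model a reader cannot be preempted between lock and unlock, which is why the reader-preemption scenario requires PREEMPT_RCU=y.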

> With CONFIG_PREEMPT_AUTO=y, you now preempt in the preempt=none mode. Something
> very different from the classical CONFIG_PREEMPT_NONE=y.

In PREEMPT_RCU=y and preempt=none mode this happens only when really
required, i.e. when the task does not schedule out or return to user
space in time, or when a task of a higher scheduling class becomes
runnable. For the latter the jury is still out whether this should be
done or just lazily deferred like the SCHED_OTHER preemption requests.
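The rule described above can be sketched as a small stand-alone C model. This is not the kernel implementation: every type, field and function name below is hypothetical, and the `force_for_higher_class` parameter exists only to represent the still-open question about higher scheduling classes.

```c
#include <stdbool.h>

/* Hypothetical scheduling classes, ordered by priority. */
enum model_class { MODEL_CLASS_OTHER, MODEL_CLASS_RT };

/* Hypothetical description of a pending reschedule request. */
struct model_resched_request {
	enum model_class waker_class; /* class of the task that became runnable */
	bool deadline_missed;         /* current task did not schedule out or
	                               * return to user space in time */
};

enum model_action { MODEL_RESCHED_LAZY, MODEL_RESCHED_FORCE };

/*
 * Decide how preempt=none treats a reschedule request in this model.
 * force_for_higher_class selects between the two options that are
 * still under discussion for higher scheduling classes.
 */
static enum model_action
model_preempt_none_action(const struct model_resched_request *req,
			  bool force_for_higher_class)
{
	if (req->deadline_missed)
		return MODEL_RESCHED_FORCE;

	if (req->waker_class > MODEL_CLASS_OTHER && force_for_higher_class)
		return MODEL_RESCHED_FORCE;

	/* SCHED_OTHER requests are deferred lazily. */
	return MODEL_RESCHED_LAZY;
}
```

In this model a plain SCHED_OTHER wakeup never forces preemption by itself. Only a missed deadline or, depending on the open policy choice, a higher-class wakeup does.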

In any case for that to matter this forced preemption would need to
preempt an RCU read side critical section and then keep the preempted
task away from the CPU for a long time.

That's very different from the unconditional kernel preemption model which
preempt=full provides and only marginally different from the existing
PREEMPT_NONE model. I know there might be dragons, but I'm not convinced
yet that this is an actual problem.

OTOH, doesn't PREEMPT_RCU=y have a mechanism to mitigate that already?

> Essentially this means preemption is now more aggressive from the point of view
> of a preempt=none user. I was suggesting that, a point of view could be RCU
> should always support preemptibility (don't give PREEMPT_RCU=n option) because
> AUTO *does preempt* unlike classic CONFIG_PREEMPT_NONE. Otherwise it is
> inconsistent -- say with CONFIG_PREEMPT=y (another *preemption mode*) which
> forces CONFIG_PREEMPT_RCU. However to Paul's point, we need to worry about those
> users who are concerned with running out of memory due to reader
> preemption.

What's wrong with the combination of PREEMPT_AUTO=y and PREEMPT_RCU=n?
Paul and I agreed long ago that this needs to be supported.

> In that vein, maybe we should also support CONFIG_PREEMPT_RCU=n for
> CONFIG_PREEMPT=y as well. There are plenty of popular systems with relatively
> low memory that need low latency (like some low-end devices / laptops
> :-)).

I'm not sure whether that's useful as the goal is to get rid of all the
CONFIG_PREEMPT_FOO options, no?

I'd rather spend brain cycles on figuring out whether RCU can be flipped
over between PREEMPT_RCU=n/y at boot or, obviously, at run-time.

Thanks,

        tglx