On Mon, Mar 11 2024 at 11:25, Joel Fernandes wrote:
> On 3/11/2024 1:18 AM, Ankur Arora wrote:
>>> Yes, I mentioned this 'disabling preemption' aspect in my last email. My point
>>> being, unlike CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_AUTO allows for kernel
>>> preemption in preempt=none. So the "Don't preempt the kernel" behavior has
>>> changed. That is, preempt=none under CONFIG_PREEMPT_AUTO is different from
>>> CONFIG_PREEMPT_NONE=y already. Here we *are* preempting. And RCU is getting on
>>
>> I think that's a view from too close to the implementation. Someone
>> using the kernel is not necessarily concerned with whether tasks are
>> preempted or not. They are concerned with throughput and latency.
>
> No, we are not only talking about that (throughput/latency). We are also talking
> about the issue related to RCU reader-preemption causing OOM (well and that
> could hurt both throughput and latency as well).

That happens only when PREEMPT_RCU=y. For PREEMPT_RCU=n the read side
critical sections still have preemption disabled.

> With CONFIG_PREEMPT_AUTO=y, you now preempt in the preempt=none mode. Something
> very different from the classical CONFIG_PREEMPT_NONE=y.

In PREEMPT_RCU=y and preempt=none mode this happens only when really
required, i.e. when the task does not schedule out or return to user
space on time, or when a higher scheduling class task gets runnable. For
the latter the jury is still out whether this should be done or just
lazily deferred like the SCHED_OTHER preemption requests.

In any case for that to matter this forced preemption would need to
preempt an RCU read side critical section and then keep the preempted
task away from the CPU for a long time.

That's very different from the unconditional kernel preemption model which
preempt=full provides and only marginally different from the existing
PREEMPT_NONE model.

I know there might be dragons, but I'm not convinced yet that this is an
actual problem.
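
To make the cases above concrete, here is a toy user-space model of the
decision, not kernel code. Every name in it (resched_request,
preempt_none_decide(), rcu_reader_preemptible(), the
force_on_higher_class knob) is made up for illustration, and the knob
exists only because the higher scheduling class case is still undecided:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
enum resched_action {
	RESCHED_NONE,	/* nothing requested */
	RESCHED_LAZY,	/* wait for schedule() or return to user space */
	RESCHED_FORCE,	/* preempt the running task in the kernel */
};

struct resched_request {
	bool pending;			/* a reschedule was requested */
	bool missed_deadline;		/* task did not schedule out or return
					 * to user space on time */
	bool higher_class_wakeup;	/* a higher scheduling class task
					 * became runnable */
};

/*
 * preempt=none with PREEMPT_AUTO: preemption requests stay lazy unless
 * the task overstays, or (if the undecided policy knob is set) a higher
 * scheduling class task needs the CPU.
 */
static enum resched_action
preempt_none_decide(const struct resched_request *r, bool force_on_higher_class)
{
	assert(r != NULL);

	if (!r->pending)
		return RESCHED_NONE;
	if (r->missed_deadline)
		return RESCHED_FORCE;
	if (r->higher_class_wakeup && force_on_higher_class)
		return RESCHED_FORCE;
	return RESCHED_LAZY;
}

/*
 * With PREEMPT_RCU=n the read side critical section has preemption
 * disabled, so only PREEMPT_RCU=y readers can be hit by a forced
 * preemption.
 */
static bool rcu_reader_preemptible(bool preempt_rcu, enum resched_action action)
{
	return preempt_rcu && action == RESCHED_FORCE;
}
```

In this model an RCU reader is only exposed when both PREEMPT_RCU=y and
the request got escalated to RESCHED_FORCE. An ordinary SCHED_OTHER
wakeup never gets there on its own.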
OTOH, doesn't PREEMPT_RCU=y have a mechanism to mitigate that already?

> Essentially this means preemption is now more aggressive from the point of view
> of a preempt=none user. I was suggesting that, a point of view could be RCU
> should always support preemptibility (don't give PREEMPT_RCU=n option) because
> AUTO *does preempt* unlike classic CONFIG_PREEMPT_NONE. Otherwise it is
> inconsistent -- say with CONFIG_PREEMPT=y (another *preemption mode*) which
> forces CONFIG_PREEMPT_RCU. However to Paul's point, we need to worry about those
> users who are concerned with running out of memory due to reader
> preemption.

What's wrong with the combination of PREEMPT_AUTO=y and PREEMPT_RCU=n?
Paul and I agreed long ago that this needs to be supported.

> In that vein, maybe we should also support CONFIG_PREEMPT_RCU=n for
> CONFIG_PREEMPT=y as well. There are plenty of popular systems with relatively
> low memory that need low latency (like some low-end devices / laptops
> :-)).

I'm not sure whether that's useful as the goal is to get rid of all the
CONFIG_PREEMPT_FOO options, no?

I'd rather spend brain cycles on figuring out whether RCU can be flipped
over between PREEMPT_RCU=n/y at boot or obviously run-time.

Thanks,

        tglx