Hello, Peter.

On Mon, Jun 24, 2024 at 02:40:53PM +0200, Peter Zijlstra wrote:
> On Wed, May 01, 2024 at 05:09:53AM -1000, Tejun Heo wrote:
> > BPF schedulers might not want to schedule certain tasks - e.g. kernel
> > threads. This patch adds p->scx.disallow which can be set by BPF schedulers
> > in such cases. The field can be changed anytime and setting it in
> > ops.prep_enable() guarantees that the task can never be scheduled by
> > sched_ext.
>
> Why ?!?!
>
> By leaving kernel threads fair, and fair sitting above the BPF thing,
> it is not dissimilar to promoting them to FIFO. They will instantly
> preempt the BPF thing and keep running for as long as they need. The
> only real difference between this and actual FIFO is the behaviour on
> contention.

Yes, from sched_ext's POV, in partial mode, CFS isn't all that different
from FIFO. Whenever there are tasks to run in CFS, CPUs are taken away.
Right now, partial mode can be useful for leaving a part of system on CFS
(e.g. in a cpuset partitioned system), when the scheduler is narrowly
focused and doesn't cover everything necessary (e.g. EAS).

> This seems like a very bad thing to have, and your 'changelog' has no
> justification what so ever.

This is a bit of duplicate interface in that in partial mode sched_ext can
already be opted in by setting per-thread sched class. However, some use
cases wanted this so that the BPF scheduler has the final say over who can
be on it rather than the userspace. It's a convenience feature for some use
cases.

Thanks.

--
tejun