On Tue, Dec 13, 2022 at 10:40 AM Rik van Riel <riel@xxxxxxxxxxx> wrote:
>
> On Tue, 2022-12-13 at 08:12 -1000, Tejun Heo wrote:
> > Hello,
> >
> > On Tue, Dec 13, 2022 at 11:55:10AM +0100, Peter Zijlstra wrote:
> > > On Mon, Dec 12, 2022 at 11:33:12AM -1000, Tejun Heo wrote:
> > >
> > > > Here, the way it's handled is a bit different, SCX has
> > > > a watchdog mechanism implemented in "[PATCH 18/31] sched_ext:
> > > > Implement runnable task stall watchdog", so if SCX tasks hang
> > > > for whatever reason including being starved by CFS, it will get
> > > > aborted and all tasks will be handed back to CFS. IOW, it's
> > > > treated like any other BPF scheduler errors that can lead to
> > > > stalls and recovered the same way.
> > >
> > > That all sounds quite terrible.. :/
> >
> > The main source of difference is that we can't implicitly trust the
> > BPF scheduler and if it malfunctions or on user request, the system
> > should always be recoverable, so there are some extra things which
> > are inherently necessary to support that.
> >
> That makes me wonder whether loading an SCX policy
> should just have that policy take over all of the
> SCHED_OTHER tasks by default, and have a failure of
> the policy just return those tasks to CFS?
>
> Having the two be operative at the same time seems
> to be a cause of hard to resolve issues, while simply
> running all non-RT tasks under the loadable policy
> could simplify both internal kernel interfaces, as
> well as externally visible effects?

There are reasons to still want CFS available even when SCX is loaded.
For example, on a partitioned, shared-tenant machine, you may want to
move one application to an SCX policy without needing to move everyone
else. Or you may want to avoid scheduling things like kthreads under an
SCX policy, since otherwise the SCX policy writer has to consider not
only the needs of application threads but also those of kernel threads.