Hello,

On Mon, Dec 12, 2022 at 01:31:11PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 29, 2022 at 10:22:56PM -1000, Tejun Heo wrote:
> > @@ -11242,3 +11268,38 @@ void call_trace_sched_update_nr_running(struct rq *rq, int count)
> >  {
> >  	trace_sched_update_nr_running_tp(rq, count);
> >  }
> > +
> > +#ifdef CONFIG_SCHED_CLASS_EXT
> > +void sched_deq_and_put_task(struct task_struct *p, int queue_flags,
> > +			    struct sched_enq_and_set_ctx *ctx)
> > +{
> > +	struct rq *rq = task_rq(p);
> > +
> > +	lockdep_assert_rq_held(rq);
> > +
> > +	*ctx = (struct sched_enq_and_set_ctx){
> > +		.p = p,
> > +		.queue_flags = queue_flags | DEQUEUE_NOCLOCK,
> > +		.queued = task_on_rq_queued(p),
> > +		.running = task_current(rq, p),
> > +	};
> > +
> > +	update_rq_clock(rq);
> > +	if (ctx->queued)
> > +		dequeue_task(rq, p, queue_flags);
> > +	if (ctx->running)
> > +		put_prev_task(rq, p);
> > +}
> > +
> > +void sched_enq_and_set_task(struct sched_enq_and_set_ctx *ctx)
> > +{
> > +	struct rq *rq = task_rq(ctx->p);
> > +
> > +	lockdep_assert_rq_held(rq);
> > +
> > +	if (ctx->queued)
> > +		enqueue_task(rq, ctx->p, ctx->queue_flags);
> > +	if (ctx->running)
> > +		set_next_task(rq, ctx->p);
> > +}
> > +#endif /* CONFIG_SCHED_CLASS_EXT */
>
> So no. Like the whole __setscheduler_prio() thing, this doesn't make
> sense outside of the core code, policy/class code should never need to
> do this.

Continuing from the __setscheduler_prio() discussion, the need arises from
the fact that whether a task is on SCX or CFS changes depending on whether
the BPF scheduler is loaded or not - e.g. when the BPF scheduler gets
disabled, all tasks that were on it need to be moved back into CFS. There
are different ways to implement this but it needs to be solved somehow.

> Also: https://lkml.kernel.org/r/20220330162228.GH14330@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Yeah, it'd be nice to encapsulate this sequence. The FOR_CHANGE_GUARD naming
throws me off a bit tho as it's not really a loop. Wouldn't it be better to
call it CHANGE_GUARD_BLOCK or sth?

Thanks.

--
tejun
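
[For context, a minimal sketch of how the helper pair quoted above could be
used when moving a task from SCX back to CFS on BPF scheduler disable. The
helper name and locking context are assumptions for illustration; only
sched_deq_and_put_task()/sched_enq_and_set_task() come from the quoted hunk,
not from any posted patch.]

/*
 * Illustrative sketch only: scx_move_task_to_cfs() is a made-up name.
 * Shows the dequeue/put -> modify -> enqueue/set pattern that the
 * quoted helpers encapsulate.  Caller is assumed to hold task_rq(p)'s
 * lock, as both helpers assert.
 */
static void scx_move_task_to_cfs(struct task_struct *p)
{
	struct sched_enq_and_set_ctx ctx;

	/* take @p off the runqueue and drop it as the current task */
	sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx);

	/* switch the scheduling class while @p is off the runqueue */
	p->sched_class = &fair_sched_class;

	/* requeue and restore @p according to the saved state */
	sched_enq_and_set_task(&ctx);
}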