Hi Ingo, On 2023-09-25 at 13:07:08 +0200, Ingo Molnar wrote: > > Anyway, it's clear from these results that while many workloads hurt > from our notion of wake-preemption, there's several ones that benefit > from it, especially generic ones like phoronix-test-suite - which have > no good way to turn off wakeup preemption (SCHED_BATCH might help though). > > One way to approach this would be to instead of always doing > wakeup-preemption (our current default), we could turn it around and > only use it when it is clearly beneficial - such as signal delivery, > or exec(). > > The canonical way to solve this would be give *userspace* a way to > signal that it's beneficial to preempt immediately, ie. yield(), > but right now that interface is hurting tasks that only want to > give other tasks a chance to run, without necessarily giving up > their own right to run: > > se->deadline += calc_delta_fair(se->slice, se); > Do you mean we want to give hint to the scheduler that, in most cases the tasks do not want to be wakeup-preempted, but some other tasks want to keep current wakeup-preempt strategy? If we want the current task to be preempted easier, we have to shrink the current task's slice, which gives up its own right to run. If this is the case, how about introducing negative se->slice, tasks has a negative slice indicates that it wants to be wakeup-preempted, and its real slice is the absolute value of se->slice. diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f7c0d9cad6e0..019576da9737 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8120,7 +8120,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ * Batch and idle tasks do not preempt non-idle tasks (their preemption * is driven by the tick): */ - if (unlikely(p->policy != SCHED_NORMAL) || !sched_feat(WAKEUP_PREEMPTION)) + if (unlikely(p->policy != SCHED_NORMAL) || || se->slice > 0 || !sched_feat(WAKEUP_PREEMPTION)) return; find_matching_se(&se, &pse); thanks, Chenyu