On Thu, 12 Dec 2019 at 11:23, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 12, 2019 at 11:14:24AM +0100, Peter Zijlstra wrote:
> > On Thu, Dec 12, 2019 at 11:10:31AM +0100, Peter Zijlstra wrote:
> > > @@ -4156,13 +4159,13 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
> > >  	if (delta_exec < sysctl_sched_min_granularity)
> > >  		return;
> > >
> > > -	se = __pick_first_entity(cfs_rq);
> > > +	se = __pick_next_entity(cfs_rq, NULL);
> > >  	delta = curr->vruntime - se->vruntime;
> > >
> > >  	if (delta < 0)
> > >  		return;
> >
> > What I mean with the below comment is, when this isn't enough, try
> > something like:
> >
> > 	if (se == cfs_rq->next)
> > 		ideal_runtime /= 2;
> >
> > to make it yield sooner to 'next' buddies. Sadly, due to the whole
> > cgroup mess, we can't say what actual task is on the end of it (without
> > doing a full hierarchy pick, which is more expensive still).
>
> Just for giggles, that'd look something like:
>
> 	while (!entity_is_task(se)) {
> 		cfs_rq = group_cfs_rq(se);
> 		se = pick_next_entity(cfs_rq, cfs_rq->curr);
> 	}
> 	p = task_of(se);
>
> 	if (is_per_cpu_kthread(p))
> 		ideal_runtime /= 2;
>
> the core-scheduling patch set includes the right primitive for this I
> think, pick_task_fair().

Why not only update wakeup_gran(), which is the only function that
uses sysctl_sched_wakeup_granularity?

For a per-CPU kthread, we could set the gran to sched_min_granularity
instead of scaling it with the thread's priority, so that a per-CPU
kthread can still preempt the current task even if
sysctl_sched_wakeup_granularity is large.

>
> > > -	if (delta > ideal_runtime)
> > > +	if (delta > ideal_runtime) // maybe frob this too ?
> > >  		resched_curr(rq_of(cfs_rq));
> > >  }
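
To make the wakeup_gran() suggestion above a bit more concrete, here is a rough user-space sketch of the idea, not a kernel patch. Everything in it is a stand-in invented for illustration: struct task_model, struct gran_tunables, wakeup_gran_sketch() and wakeup_should_preempt() do not exist in the kernel, the weight scaling only approximates what scaling the granularity by priority would do, and the tunable values used with it are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Load weight of a nice-0 task in this model (assumption for the sketch). */
#define NICE_0_WEIGHT 1024u

/* Hypothetical stand-in for the few task properties that matter here. */
struct task_model {
	uint32_t load_weight;     /* larger weight means higher priority */
	bool     per_cpu_kthread; /* models the is_per_cpu_kthread(p) check */
};

/* Hypothetical container for the two tunables, both in nanoseconds. */
struct gran_tunables {
	uint64_t min_granularity_ns;
	uint64_t wakeup_granularity_ns;
};

/*
 * Sketch of the proposal: a per-CPU kthread gets the minimum granularity
 * unscaled; any other task gets the wakeup granularity scaled by its weight.
 */
static uint64_t wakeup_gran_sketch(const struct task_model *p,
				   const struct gran_tunables *t)
{
	assert(p->load_weight != 0); /* avoid dividing by zero below */

	if (p->per_cpu_kthread)
		return t->min_granularity_ns;

	return t->wakeup_granularity_ns * NICE_0_WEIGHT / p->load_weight;
}

/*
 * The waking task preempts only if the current task's vruntime is ahead
 * of it by more than the granularity computed above.
 */
static bool wakeup_should_preempt(int64_t vdiff_ns,
				  const struct task_model *p,
				  const struct gran_tunables *t)
{
	return vdiff_ns > 0 && (uint64_t)vdiff_ns > wakeup_gran_sketch(p, t);
}
```

With a deliberately large wakeup granularity (say 10 ms) and a 0.75 ms minimum granularity, a nice-0 task that is 2 ms behind the current task would not preempt in this model, while a per-CPU kthread with the same lag would.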