On Thu, Nov 19, 2020 at 12:51:32PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 18, 2020 at 08:48:43PM +0100, Thomas Gleixner wrote:
> 
> > @@ -4073,6 +4089,7 @@ prepare_task_switch(struct rq *rq, struc
> >  	perf_event_task_sched_out(prev, next);
> >  	rseq_preempt(prev);
> >  	fire_sched_out_preempt_notifiers(prev, next);
> > +	kmap_local_sched_out();
> >  	prepare_task(next);
> >  	prepare_arch_switch(next);
> >  }
> > @@ -4139,6 +4156,7 @@ static struct rq *finish_task_switch(str
> >  	finish_lock_switch(rq);
> >  	finish_arch_post_lock_switch();
> >  	kcov_finish_switch(current);
> > +	kmap_local_sched_in();
> > 
> >  	fire_sched_in_preempt_notifiers(current);
> >  	/*
> 
> > +void __kmap_local_sched_out(void)
> > +{
> > +	struct task_struct *tsk = current;
> > +	pte_t *kmap_pte = kmap_get_pte();
> > +	int i;
> > +
> > +	/* Clear kmaps */
> > +	for (i = 0; i < tsk->kmap_ctrl.idx; i++) {
> > +	}
> > +}
> > +
> > +void __kmap_local_sched_in(void)
> > +{
> > +	struct task_struct *tsk = current;
> > +	pte_t *kmap_pte = kmap_get_pte();
> > +	int i;
> > +
> > +	/* Restore kmaps */
> > +	for (i = 0; i < tsk->kmap_ctrl.idx; i++) {
> > +	}
> > +}
> 
> So even in the optimal case, this adds an unconditional load of
> tsk->kmap_ctrl.idx to schedule() (2 misses, one pre and one post).
> 
> Munging preempt-notifiers behind a static_branch, which in that same
> optimal case, avoided touching curr->preempt_notifier, resulted in a
> measurable performance improvement. See commit:
> 
>   1cde2930e154 ("sched/preempt: Add static_key() to preempt_notifiers")
> 
> Can we fudge some state in a cacheline we're already touching to avoid
> this?

The only state we seem to consistently look at after schedule() is
need_resched()'s TIF_NEED_RESCHED. But adding a TIF flag to all archs and
setting/clearing it from kmap_local might be a bit daft.. :/
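
To make the (half-)suggestion concrete, the TIF variant would look roughly
like the sketch below. TIF_KMAP_LOCAL is made up here and would need a bit
in every architecture's thread_info flags, which is the daft part; the
kmap_ctrl.idx field is the one from Thomas's patch above.

/*
 * Hypothetical sketch only: TIF_KMAP_LOCAL does not exist.  The idea is to
 * keep the "this task has live local kmaps" state in the thread flags word
 * that schedule() already touches via TIF_NEED_RESCHED, so the common case
 * never loads tsk->kmap_ctrl.idx at all.
 */
static inline void kmap_local_sched_out(void)
{
	if (unlikely(test_thread_flag(TIF_KMAP_LOCAL)))
		__kmap_local_sched_out();
}

static inline void kmap_local_sched_in(void)
{
	if (unlikely(test_thread_flag(TIF_KMAP_LOCAL)))
		__kmap_local_sched_in();
}

/* Called when kmap_ctrl.idx changes: set on first map, clear on last unmap */
static inline void kmap_local_update_flag(struct task_struct *tsk)
{
	if (tsk->kmap_ctrl.idx)
		set_thread_flag(TIF_KMAP_LOCAL);
	else
		clear_thread_flag(TIF_KMAP_LOCAL);
}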
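
For comparison, the static_branch trick from 1cde2930e154 boils down to
something like the following (a sketch, not the exact kernel/sched/core.c
code): the key is only enabled once a notifier is registered, so the
optimal case is a patched-out branch that never touches
curr->preempt_notifiers.

static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);

void preempt_notifier_register(struct preempt_notifier *notifier)
{
	static_branch_inc(&preempt_notifier_key);
	hlist_add_head(&notifier->link, &current->preempt_notifiers);
}

static void fire_sched_out_preempt_notifiers(struct task_struct *curr,
					     struct task_struct *next)
{
	/* Patched-out branch in the common (no notifier) case */
	if (static_branch_unlikely(&preempt_notifier_key))
		__fire_sched_out_preempt_notifiers(curr, next);
}

The kmap_local case is harder because the state is per task rather than
global, which is why the question above is about per-task state in an
already-hot cacheline rather than a static key.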