* Steven Rostedt <srostedt@xxxxxxxxxx> wrote: > (added Rusty) > > On Mon, 2009-01-19 at 13:04 -0500, Chris Mason wrote: > > On Thu, 2009-01-15 at 00:11 -0700, Ma, Chinang wrote: > > > >> > > > > > > > >> > > > > Linux OLTP Performance summary > > > >> > > > > Kernel# Speedup(x) Intr/s CtxSw/s us% sys% idle% > > > >iowait% > > > >> > > > > 2.6.24.2 1.000 21969 43425 76 24 0 > > > >0 > > > >> > > > > 2.6.27.2 0.973 30402 43523 74 25 0 > > > >1 > > > >> > > > > 2.6.29-rc1 0.965 30331 41970 74 26 0 > > > >0 > > > >> > > > > >> > > But the interrupt rate went through the roof. > > > >> > > > > >> > Yes. I forget why that was; I'll have to dig through my archives for > > > >> > that. > > > >> > > > >> Oh. I'd have thought that this alone could account for 3.5%. > > > > A later email indicated the reschedule interrupt count doubled since > > 2.6.24, and so I poked around a bit at the causes of resched_task. > > > > I think the -rt version of check_preempt_equal_prio has gotten much more > > expensive since 2.6.24. > > > > I'm sure these changes were made for good reasons, and this workload may > > not be a good reason to change it back. But, what does the patch below > > do to performance on 2.6.29-rcX? > > > > -chris > > > > diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c > > index 954e1a8..bbe3492 100644 > > --- a/kernel/sched_rt.c > > +++ b/kernel/sched_rt.c > > @@ -842,6 +842,7 @@ static void check_preempt_curr_rt(struct rq *rq, > > struct task_struct *p, int sync > > resched_task(rq->curr); > > return; > > } > > + return; > > > > #ifdef CONFIG_SMP > > /* > > That should not cause much of a problem if the scheduling task is not > pinned to an CPU. But!!!!! > > A recent change makes it expensive: > > commit 24600ce89a819a8f2fb4fd69fd777218a82ade20 > Author: Rusty Russell <rusty@xxxxxxxxxxxxxxx> > Date: Tue Nov 25 02:35:13 2008 +1030 > > sched: convert check_preempt_equal_prio to cpumask_var_t. > > Impact: stack reduction for large NR_CPUS > > > > which has: > > static void check_preempt_equal_prio(struct rq *rq, struct task_struct > *p) > { > - cpumask_t mask; > + cpumask_var_t mask; > > if (rq->curr->rt.nr_cpus_allowed == 1) > return; > > - if (p->rt.nr_cpus_allowed != 1 > - && cpupri_find(&rq->rd->cpupri, p, &mask)) > + if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) > return; > > > > > check_preempt_equal_prio is in a scheduling hot path!!!!! > > WTF are we allocating there for? Agreed - this needs to be fixed. Since this runs under the runqueue lock we can have a temporary cpumask in the runqueue itself, not on the stack. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html