On 27 November 2012 14:59, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Tue, 2012-11-27 at 19:18 +0530, Viresh Kumar wrote:
>> On 27 November 2012 18:56, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>> > A couple of things. sched_select_cpu() is not cheap. It has a double
>> > loop of domains/cpus looking for a non-idle cpu. If we have 1024 CPUs,
>> > and we are CPU 1023 and all other CPUs happen to be idle, we could be
>> > searching 1023 CPUs before we come up with our own.
>>
>> Not sure if you missed the first check in sched_select_cpu():
>>
>> +int sched_select_cpu(unsigned int sd_flags)
>> +{
>> +       /* If current cpu isn't idle, don't migrate anything */
>> +       if (!idle_cpu(cpu))
>> +               return cpu;
>>
>> We aren't going to search if we aren't idle.
>
> OK, we are idle, but CPU 1022 isn't. We still need a large search. But,
> heh, we are idle, we can spin. But then why go through this in the first
> place ;-)

By migrating it now, the work will create its activity there and wake up
on the right CPU next time. If migrating to just any CPU seems a bit
risky, we could restrict the migration to a CPU on the same node. We can
pass such constraints to sched_select_cpu(); see the rough sketches at
the end of this mail.

>
>> > Also, I really don't like this as a default behavior. It seems that
>> > this solution is for a very special case, and this can become very
>> > intrusive for the normal case.
>>
>> We tried with a KCONFIG option for it, which Tejun rejected.
>
> Yeah, I saw that. I don't like adding KCONFIG options either. Best is to
> get something working that doesn't add any regressions. If you can get
> this to work without making *any* regressions in the normal case then
> I'm totally fine with that. But if this adds any issues with the normal
> case, then it's a show stopper.
>
>> > To be honest, I'm uncomfortable with this approach. It seems to be
>> > fighting a symptom and not the disease. I'd rather find a way to keep
>> > work from being queued on the wrong CPU. If it is a timer, find a way
>> > to move the timer. If it is something else, let's work to fix that.
>> > Doing searches of possibly all CPUs (unlikely, but it is there) just
>> > seems wrong to me.
>>
>> As Vincent pointed out, on big.LITTLE systems we just don't want to
>> serve works on the big cores. That would be wasting too much power,
>> especially if we are going to wake up the big cores.
>>
>> It would be difficult to constrain every source driver (which queues
>> the work) to the little cores. We thought: if somebody wants to queue
>> work on the current cpu, they must use queue_work_on().
>
> As Tejun has mentioned earlier, are there any assumptions anywhere that
> expect an unbound work item not to migrate? Where per-cpu variables
> might be used. Tejun had a good idea of forcing the work to migrate
> *every* time, i.e. never let a work item run on the same CPU that it was
> queued on. If it can survive that, then it is probably OK. Maybe add a
> config option that forces this? That way, anyone can test that this
> isn't an issue.
>
> -- Steve
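
Rough sketch of what I mean by passing constraints (reconstructed from the
snippet quoted above, not the actual patch; the name, the RCU locking and
which SD_* flag to pass would need checking against the real series). The
point is the early bail-out when the local CPU is busy, plus the sd_flags
argument that lets a caller bound the search, e.g. to domains inside the
same node:

        /* Sketch only: pick a non-idle CPU, constrained by sd_flags. */
        static int sched_select_cpu_sketch(unsigned int sd_flags)
        {
                struct sched_domain *sd;
                int cpu = smp_processor_id();
                int i;

                /* If the current cpu isn't idle, don't migrate anything. */
                if (!idle_cpu(cpu))
                        return cpu;

                rcu_read_lock();
                for_each_domain(cpu, sd) {
                        /* Skip domains that don't match the caller's constraints. */
                        if (!(sd->flags & sd_flags))
                                continue;

                        for_each_cpu(i, sched_domain_span(sd)) {
                                if (!idle_cpu(i)) {
                                        rcu_read_unlock();
                                        return i;
                                }
                        }
                }
                rcu_read_unlock();

                /* Everybody is idle; stay where we are. */
                return cpu;
        }

So the 1023-CPU walk only happens when we are idle anyway, and a caller
worried about cross-node migration simply passes a narrower sd_flags.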
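
And a rough sketch of the forced-migration test Tejun and Steve describe
(CONFIG_WQ_FORCE_MIGRATE and wq_select_cpu() are made-up names, just to
illustrate where such a hook could sit when __queue_work() resolves
WORK_CPU_UNBOUND):

        /* Hypothetical debug helper; CONFIG_WQ_FORCE_MIGRATE is not a real option. */
        static int wq_select_cpu(int cpu)
        {
        #ifdef CONFIG_WQ_FORCE_MIGRATE
                /*
                 * Unbound work queued "here": deliberately push it to another
                 * online CPU so hidden per-cpu assumptions in the work item
                 * show up immediately.
                 */
                if (cpu == WORK_CPU_UNBOUND) {
                        int other = cpumask_any_but(cpu_online_mask,
                                                    raw_smp_processor_id());

                        if (other < nr_cpu_ids)
                                return other;
                }
        #endif
                return cpu;
        }

With the option off, the normal queueing path stays untouched, which keeps
the "no regressions in the normal case" constraint Steve insists on above.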