* Arun R Bharadwaj <arun@xxxxxxxxxxxxxxxxxx> wrote:

> Hi,
>
> In an SMP system, tasks are scheduled on different CPUs by the
> scheduler, interrupts are managed by irqbalancer daemon, but
> timers are still stuck to the CPUs that they have been
> initialised.  Timers queued by tasks gets re-queued on the CPU
> where the task gets to run next, but timers from IRQ context
> like the ones in device drivers are still stuck on the CPU
> they were initialised.  This framework will help move all
> 'movable timers' from one CPU to any other CPU of choice using
> a sysfs interface.

hm, the intention is good, and the concept of migrating timers to their
target CPU is good as well. We already do some of that for regular
timers.

But the whole sysfs interface you implemented here is neither
particularly clean nor efficient.

The main problem is that timers are really fast-moving entities, and
so are the tasks they are related to. Your implementation completely
ties the direction of migration (the timer scheduling) to a clumsy
sysfs interface:

+	if (sscanf(buf, "%d", &target_cpu) && cpu_online(target_cpu)) {
+		ret = count;
+		per_cpu(enable_timer_migration, cpu->sysdev.id) = target_cpu;
+	}

That doesn't really scale and i doubt it works in practice. We should
not schedule timers via sysfs, we should let the kernel do it
automatically. [*]

So what i'd suggest instead is to extend the scheduler power-saving
code, which already identifies a 'load balancer CPU', to also attract
all attractable sources of timers - automatically. See the
'load_balancer' CPU logic in kernel/sched.c.

Does that sound OK to you? I think the end result might even give
better numbers - and out of the box.

I'd also suggest not using that rather ugly enable_timer_migration
per-cpu variable, but simply reusing the existing nohz.load_balancer as
the target CPU.

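To make the suggestion above a bit more concrete, here is a minimal
userspace sketch of the selection logic only. It is not kernel code and
not a patch: struct nohz_model, NO_BALANCER, timer_target_cpu() and the
'movable' flag are hypothetical names made up for illustration, and the
real nohz.load_balancer handling in kernel/sched.c may look quite
different. The point is just that the target CPU falls out of state the
scheduler already maintains, so no per-cpu sysfs knob is needed.

```c
#include <stdbool.h>

/* Hypothetical sentinel: no idle load balancer CPU is nominated right now. */
#define NO_BALANCER (-1)

/*
 * Userspace stand-in for the scheduler's nohz state. Only the one field
 * this sketch needs is modelled; this is an assumption, not the real layout.
 */
struct nohz_model {
	int load_balancer;	/* nominated load balancer CPU, or NO_BALANCER */
};

/*
 * Pick the CPU a timer should be queued on.
 *
 * Non-movable timers stay on the CPU that queues them. Movable timers
 * follow the load balancer CPU when one is nominated, and otherwise
 * stay local as well.
 */
static int timer_target_cpu(const struct nohz_model *nohz,
			    int this_cpu, bool movable)
{
	if (!movable)
		return this_cpu;

	if (nohz->load_balancer == NO_BALANCER)
		return this_cpu;

	return nohz->load_balancer;
}
```

With load_balancer set to 2, a movable timer queued from CPU 5 would be
targeted at CPU 2, while a non-movable one stays on CPU 5. Once no load
balancer is nominated, everything stays local again, without any
user-space involvement.
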
Also, please base your patches on the latest timer tree (which already
modified some of this code in this cycle):

   http://people.redhat.com/mingo/tip.git/README

Btw., could you please also fix your mailer to not do this to us:

  Mail-Followup-To: linux-kernel@xxxxxxxxxxxxxxx,
	linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx, a.p.zijlstra@xxxxxxxxx,
	ego@xxxxxxxxxx, tglx@xxxxxxxxxxxxx, mingo@xxxxxxx,
	andi@xxxxxxxxxxxxxx, venkatesh.pallipadi@xxxxxxxxx,
	vatsa@xxxxxxxxxxxxxxxxxx, arjan@xxxxxxxxxxxxx

it messes up the replies.

	Ingo

[*] IRQ migration (where you possibly got the sysfs idea from) is a
    special case where 'slow scheduling' via a user-space daemon is
    possible: they are an external source of events and they are
    concentrators of work. The same concept does not apply to timers,
    most of which are inherently task-generated.
_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm