On Tue, Dec 15, 2009 at 12:21:07AM +0100, Andi Kleen wrote:
> Salman Qazi <sqazi@xxxxxxxxxx> writes:
> >
> > We'd like to get as much of our stuff upstream as we can.  Given that
> > this is a somewhat sizable chunk of work, it would be impolite of me
> > to just send out a bunch of patches without hearing the concerns of
> > the community.  What are your thoughts on our design and what do we
> > need to change to get this to be more acceptable to the community?  I
> > also would like to know if there are any existing pieces of
> > infrastructure that this can utilize.
>
> There were a lot of discussions on this a few months ago in context
> of the ACPI 4 "power aggregator" which is a similar (perhaps
> slightly less sophisticated) concept.
>
> While there was a lot of talk about teaching the scheduler about this
> the end result was just a driver which just starts real time threads
> and then idles in them.  This is in current mainline.
>
> It might be a good idea to review these discussions in the archives.

It should be noted that most of the heat from those discussions was
over adding the ACPI 4 mechanism to accept requests from the hardware
platform to add idle cycles in the case of thermal/power emergencies,
before we had the scheduler improvements to be able to do so in the
most efficient way possible.  See the description of commit 8e0af5141:

    ACPI 4.0 created the logical "processor aggregator device" as a
    mechinism for platforms to ask the OS to force otherwise busy
    processors to enter (power saving) idle.

    The intent is to lower power consumption to ride-out transient
    electrical and thermal emergencies, rather than powering off the
    server....

    Vaidyanathan Srinivasan has proposed scheduler enhancements to
    allow injecting idle time into the system.  This driver doesn't
    depend on those enhancements, but could cut over to them when
    they are available.

    Peter Z. does not favor upstreaming this driver until the those
    scheduler enhancements are in place.  However, we favor
    upstreaming this driver now because it is useful now, and can be
    enhanced over time.

It looks to me that the scheme Salman has proposed for adding idle
cycles is quite sophisticated, probably more so than Vaidyanathan's.
The main difference is that Google wants to be able to control the
system's power/thermal envelope from userspace, as opposed to letting
the hardware request it in an emergency situation.  This makes sense
if you are trying to balance the power/thermal requirements across a
large number of systems, as opposed to responding to a local
power/thermal emergency signalled from the platform's firmware.

So it would seem to me that Salman's suggestions are very similar to
what Peter requested before this commit went in (over his objections).

Regards,

						- Ted
_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm