Re: [PATCH]new ACPI processor driver to force CPUs idle

* Shaohua Li <shaohua.li@xxxxxxxxx> [2009-06-24 16:21:12]:

> On Wed, Jun 24, 2009 at 04:03:05PM +0800, Peter Zijlstra wrote:
> > On Wed, 2009-06-24 at 15:47 +0800, Shaohua Li wrote:
> > > On Wed, Jun 24, 2009 at 02:39:18PM +0800, Peter Zijlstra wrote:
> > > > On Wed, 2009-06-24 at 12:13 +0800, Shaohua Li wrote:
> > > > > This patch supports the processor aggregator device. When the OS gets an
> > > > > ACPI notification, the driver will idle some number of CPUs.
> > > > > 
> > > > > To make a CPU idle, the patch creates a power saving thread. The scheduler
> > > > > will migrate the thread to the preferred CPU. The thread has max priority
> > > > > and SCHED_RR policy, so it can occupy one CPU. To save power, the thread
> > > > > keeps calling the C-state instruction. Routine power_saving_thread() is the
> > > > > entry point of the thread.
> > > > > 
> > > > > To avoid starvation, the thread will sleep for 5% of every second
> > > > > (the current RT scheduler has a threshold to avoid starvation, but if other
> > > > > CPUs are idle, the CPU can borrow CPU time from them, which makes that
> > > > > mechanism ineffective here).
> > > > > 
> > > > > This approach (forcing the CPU idle) should have no impact on the scheduler,
> > > > > and tasks with affinity still get a chance to run even if they are bound to
> > > > > an idled CPU. Any comments/suggestions are welcome.
> > > > 
> > > > > +static int power_saving_thread(void *data)
> > > > > +{
> > > > > +	struct sched_param param = {.sched_priority = MAX_RT_PRIO - 1};
> > > > > +	int do_sleep;
> > > > > +
> > > > > +	/*
> > > > > +	 * we just create a RT task to do power saving. Scheduler will migrate
> > > > > +	 * the task to any CPU.
> > > > > +	 */
> > > > > +	sched_setscheduler(current, SCHED_RR, &param);
> > > > > +
> > > > 
> > > > This is crazy and wrong.
> > > > 
> > > > 1) cpusets can be so configured as to not have the full machine in a
> > > > single load-balance domain, eg. the above comment about the scheduler is
> > > > false.
> > > Assume the user will not assign such a thread to a cpuset; if they do, it's
> > > the user's mistake.
> > 
> > No, it's user policy, and especially on large machines cpusets are very useful.
> > The kernel not taking that into account is simply not an option.
> > 
> > Any thermal facility that doesn't take cpusets into account, or worse
> > destroys user policy (the hotplug road), is a full stop in my book.
> > 
> > It's similar to the saying that the customer is always right: sure, the admin
> > can configure the machine so that any thermal policy is doomed to fail, and
> > in that case I would print some warnings into syslog and let the machine die
> > of thermal overload -- not our problem.
> > 
> > The thing is, the admin configures it in a way, and then expects it to
> > work like that. If any random event can void the guarantees what good
> > are they?
> > 
> > Now, if ACPI-4.0 is so broken that it simply cannot support a sane
> > thermal model, then I suggest we simply not support this feature and
> > hope they will grow clue for 4.1 and try again next time.
> The assumption is that the user does not assign the power saving thread to a
> specific cpuset. I thought the assumption was reasonable: the user can assign
> the threads they care about to a cpuset, but not all threads.
> The power saving thread stays in the top cpuset, so it still has a chance to run
> on any CPU. If the power saving thread runs on a CPU, the tasks on that CPU still
> have a chance to run (at least 0.05s), so it does not completely break user policy.

How do we handle interrupts and timers during this interval?  You seem
to disable interrupts and hold the CPU at idle for 0.95 sec.  That may
cause timeouts and overflows for network interrupts, right?
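
To make the interval in question concrete, here is a minimal user-space
sketch of the duty-cycle arithmetic, assuming the 1 second period and the
5% sleep figure from the patch description.  The struct and function names
are hypothetical and do not come from the patch.

```c
/*
 * Hypothetical helper (not part of the patch): split one period into the
 * forced-idle window and the window left over for other tasks.
 */
struct duty_cycle {
	unsigned int idle_ms;	/* power saving thread holds the CPU idle */
	unsigned int run_ms;	/* power saving thread sleeps, others may run */
};

static struct duty_cycle split_period(unsigned int period_ms,
				      unsigned int sleep_percent)
{
	struct duty_cycle dc;

	/* Clamp so the sleep window can never exceed the period. */
	if (sleep_percent > 100)
		sleep_percent = 100;

	/* Widen before multiplying so long periods cannot overflow. */
	dc.run_ms = (unsigned int)((unsigned long long)period_ms *
				   sleep_percent / 100);
	dc.idle_ms = period_ms - dc.run_ms;
	return dc;
}
```

With split_period(1000, 5) the idle window is 950 ms and the run window is
50 ms, which matches the 0.95 sec / 0.05 sec figures in this thread.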

The next issue is that sibling threads belonging to a core must be halted
at the same time to get any power/thermal benefit.  Who does the
coordination for forced idle in this approach?
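
As an illustration of the kind of coordination I mean, below is a rough
user-space sketch that rounds an idle request up to whole cores so that
siblings are always selected together.  It assumes a hypothetical topology
where the siblings of a core are numbered adjacently (core = cpu /
threads_per_core); a real implementation would have to use the kernel's
sibling map.  It only covers which CPUs to pick, not how to make them idle
at the same instant.

```c
#include <stddef.h>

/*
 * Hypothetical selection policy (not from the patch): choose CPUs to
 * force idle in whole-core groups, starting from the highest core.
 * Returns the number of CPU ids written to out[].
 */
static size_t pick_cpus_to_idle(unsigned int nr_cpus,
				unsigned int threads_per_core,
				unsigned int nr_requested,
				unsigned int *out, size_t out_len)
{
	unsigned int nr_cores, cores_needed, i, t;
	size_t n = 0;

	if (threads_per_core == 0)
		return 0;

	nr_cores = nr_cpus / threads_per_core;

	/* Round the request up to whole cores, capped at what exists. */
	cores_needed = nr_requested / threads_per_core +
		       (nr_requested % threads_per_core != 0);
	if (cores_needed > nr_cores)
		cores_needed = nr_cores;

	for (i = 0; i < cores_needed; i++) {
		unsigned int core = nr_cores - 1 - i;

		/* Never emit a partial core: stop if out[] lacks room. */
		if (out_len - n < threads_per_core)
			break;
		for (t = 0; t < threads_per_core; t++)
			out[n++] = core * threads_per_core + t;
	}
	return n;
}
```

For example, on 8 CPUs with 2 threads per core, a request to idle 3 CPUs
is rounded up to 2 cores and yields CPUs 6, 7, 4 and 5.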

--Vaidy
