Re: Ottawa Linux Power Management Summit, June 25-26, 2007 - Minutes

On Wednesday, 5 September 2007 10:26, Len Brown wrote:
> A Linux Power Management "mini-summit" was held in Ottawa
> on June 25 and 26, 2007, immediately preceding the Ottawa Linux Symposium.
> 
> An effort was made to follow the best-known-method
> for a Linux mini-summit, thought to be the most recent
> storage-summit.  The invitation to the meeting was open --
> sent to linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx in early May.
> The focus of the meeting was on technical discussion.  Thus,
> only presentations which supported discussion were encouraged,
> and the size of the forum was capped at 20.  The agenda was set
> by consensus of the attendees.
> 
> Thank you to the Intel Open Source Technology Center
> for sponsoring the meeting.
> 
> Day 1 attendees:
> 
> Len Brown, Intel OTC, Linux Kernel ACPI Maintainer
> Mark Gross, Intel OTC, embedded Linux team
> Paul Mundt, Renesas, Linux Kernel Super-H Maintainer
> Kevin Hilman, MontaVista, MV DPM Maintainer
> Igor Stoppa, Nokia, OSSO Power Management
> Sakari Poussa, Nokia, OSSO Power Management
> Dave Jones, Red Hat, Fedora Maintainer, Linux Kernel Cpufreq Maintainer
> Klaus Pedersen, Nokia, OSSO Power Management
> Ken Rozendal, IBM, Linux on Power
> Vivek Kashyap, IBM LTC
> Adam Belay, Novell/MIT, cpuidle developer
> Eugeny S. Mints, NGS Power Management
> Scott E. Preece, Motorola
> Marcelo Tosatti, Red Hat, One Laptop Per Child
> 
> Day 2 additional attendees:
> 
> Tariq Shureih, Intel OTC, MID power policy manager
> Rishi Bhattacharya, Texas Instruments
> Ilias Biris, Instituto de Tecnologia
> 
> notes:
> 
> Mark Gross showed off a Classmate PC.  The unit he had was a 900MHz
> Celeron (model 13).  Find out more at http://classmatepc.com
> 
> Mark led a discussion about constraints/quality of service.
> An application specifies a QOS/SLA to some middle-ware, which
> translates that into operation constraints.  We discussed the
> vocabulary for constraints.  More on this below.
> 
> Igor Stoppa presented findings from the Nokia Tablet team.
> The OMAP1 used in the n770 had idle/big-sleep/deep-sleep.
> The OMAP2, used in the n800, is built on 90nm technology.
> The OMAP3 is expected to be built on high leakage 65nm technology,
> and thus to require software to take advantage of power-gating off states.
> Indeed, the OMAP3 has over 30 power gates.
> 
> http://linux.omap.com has OMAP Linux resources.
> http://source.mvista.com hosts OMAP patches before they get
> to kernel.org
> 
> Re: Performance States
> 
> Igor asserted that once a voltage is selected, it is always
> the best policy to run at the maximum frequency supported by
> that voltage.
> 
> However, the OMAP2 throws Linux a curve ball: when the ARM core is
> raised to its maximum speed, the speed of the DSP is _reduced_.
> E.g. 400MHz and 133MHz respectively.  cpufreq doesn't
> have a concept of this kind of dependency.
> 
> cpufreq_set_policy() doesn't match Nokia's needs as it is a 1-way
> notification, and there is no way to register constraints.
> 
> Igor reported a scaling frequency bug where the current polling
> interval and minimum residency formulas in ondemand don't work
> on Nokia's hardware.
> 
> He also described "spread to deadline" in contrast to "race to
> idle".  In spread-to-deadline, the work is run at the minimum rate
> such that it will complete in time for a known future deadline.
> The deadline might be an expected external periodic communication
> event, for example.
> 
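
To make the contrast concrete, here is a minimal sketch of the two
policies in Python.  The function names, the cycle count and the
frequency table are hypothetical, chosen only for illustration; nothing
here is Nokia's code or a cpufreq interface.  It assumes the amount of
work is known in CPU cycles and that run time scales inversely with
frequency.

```python
def spread_to_deadline(cycles, deadline_ms, freqs_hz):
    """Pick the lowest available frequency that still meets the deadline.

    cycles      -- estimated CPU cycles of work (hypothetical input)
    deadline_ms -- time until the known future deadline, in milliseconds
    freqs_hz    -- frequencies the hardware supports, in Hz
    """
    required_hz = cycles * 1000 / deadline_ms
    for freq in sorted(freqs_hz):
        if freq >= required_hz:
            return freq
    # The deadline cannot be met at any frequency; run flat out.
    return max(freqs_hz)


def race_to_idle(freqs_hz):
    """Always run at the highest frequency, then go idle as early as possible."""
    return max(freqs_hz)
```

For example, with a table of 100, 200 and 400 MHz, 3 million cycles due
in 20ms need at least 150 MHz, so spread_to_deadline() selects 200 MHz
where race_to_idle() would select 400 MHz.
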
> Re: pause/resume
> Total pause/resume on the n800 is 20-80ms.
> PLL re-lock takes about 0.1ms and the voltage ramp is about 5ms
> by comparison.  The big time consumer is drivers, in particular
> syncing with screen updates.
> 
> Paul Mundt contrasted the clock framework with cpufreq, saying
> that one could build a rate table of all P-state transitions.
> Though this would need to be prototyped to see if it is viable.
> 
> Marcelo Tosatti showed off an OLPC XO-1 (http://laptop.org/).
> It includes a 433MHz AMD Geode LX.
> (This replaced the previous cache-less Geode GX.)
> The XO-1 has 1G NAND flash and a 1200x900 LED screen which uses 0.2W min,
> 1.0W max.  These screen power numbers are truly impressive.
> 
> OLPC wants to aggressively auto-suspend to a suspend-to-RAM
> like state, except the screen stays on (and wireless stays on).
> The system wakes upon user-input.  The requirement for this state
> is < 100ms resume latency.  Jim Gettys asserts that the iPAQ could
> resume in 10ms by comparison.  Marcelo reports that the XO-1
> can resume in 160ms today if USB is disabled.  However, if USB
> is enabled, it resumes in 250ms.  He thinks that resume needs to
> be multi-threaded, and it needs to be smarter so that it doesn't
> blindly resume every device in the system.
> 
> The XO-1 has a Display Controller (DCON), which will refresh the display
> even when the processor is completely powered off.
> 
> Regarding wake, enable_irq_wake(irq) is ugly b/c it is IRQ specific.
> It needs to be enable-wakeup(device) -- a generic API.
> 
> The audio amplifier must delay power-up by ~100ms to avoid a pop.
> 
> OLPC is not using suspend-to-disk, yet.
> 
> Discussed the STD vs STR path.  The expectation is that STR can be
> faster if it doesn't follow the same path as STD.  Per the list,
> Rafael is working on this.

Well, in 2.6.23 the hibernation (STD is a PCish name) and suspend (i.e.
STR, standby, etc.) code paths will be separate on the highest level.  Still,
they both use the freezer and device_suspend()/device_resume(), which consume
the majority of the suspend/resume time.

> OLPC is using OHM - Open HW Manager -- a generic system manager,
> of which power management is just one part.
> 
> olpc-pm.c olpc_pm_enter() is kicked off by OHM on detecting idle.
> 
> Dave Jones led a discussion on cpufreq.
> 
> Re: Accounting vs cpufreq.
> Enterprise capacity planning applications get confused by cpufreq.
> When cpufreq lowers the MHz due to low demand, the management application
> sees no idle time left -- indicating that the system has reached capacity
> and needs to be upgraded.
> 
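
The confusion is easy to reproduce with arithmetic.  The sketch below is
only an illustration, in Python: the function names and the sample
numbers are made up, and it assumes throughput scales linearly with
clock frequency, which is optimistic for memory-bound workloads.

```python
def apparent_utilization(busy_s, window_s):
    """What a tool sees if it only looks at busy vs. idle time."""
    return busy_s / window_s


def capacity_utilization(busy_s, window_s, cur_khz, max_khz):
    """Busy time rescaled to the fraction of full-speed capacity in use.

    Assumes work done per second is proportional to frequency.
    """
    return (busy_s / window_s) * (cur_khz / max_khz)
```

A CPU that is busy for 0.9s out of every 1.0s while clocked down to
600 MHz on a 2400 MHz part looks 90% loaded to the first function, but
is using only about 22.5% of its full-speed capacity by the second.
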
> Dave commented that the cpufreq conservative governor should
> be deleted and whatever hooks are needed should simply be added
> to ondemand.
> 
> MHz vs scheduler: today cpufreq simply tracks idle time and the
> scheduler is completely unaware that cpufreq changes the frequency.
> Application hints may be appropriate for apps to tell the scheduler
> about their MHz needs.  Also, the scheduler may be better off
> scheduling cycles instead of scheduling time.
> 
> Discussion on APERF/MPERF MSRs on Intel processors: The APERF/MPERF
> ratio conveys the "actual" to "maximum" MHz ratio since the
> MSRs were last reset.  Note that with Intel Dynamic Acceleration
> (IDA), this ratio can be greater than 1 -- so maybe "maximum"
> needs to be re-worded as "marketing":-)
> 
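
As a rough illustration of how the ratio is used, the Python sketch
below takes two samples of each counter and divides the deltas, which
gives the average over the sampling interval rather than since the last
reset.  Reading the MSRs themselves is not shown; the counter values are
passed in as plain integers, and the function names and the nominal
frequency argument are hypothetical.

```python
MSR_MASK = (1 << 64) - 1  # the counters are treated as 64-bit values


def counter_delta(new, old):
    """Difference between two counter samples, tolerating wrap-around."""
    return (new - old) & MSR_MASK


def aperf_mperf_ratio(aperf_old, mperf_old, aperf_new, mperf_new):
    """Average actual/maximum frequency ratio between the two samples."""
    d_mperf = counter_delta(mperf_new, mperf_old)
    if d_mperf == 0:
        raise ValueError("MPERF did not advance between samples")
    return counter_delta(aperf_new, aperf_old) / d_mperf


def effective_mhz(ratio, nominal_mhz):
    """Scale the nominal ("marketing") frequency by the measured ratio."""
    return ratio * nominal_mhz
```

With this definition a ratio above 1.0 simply falls out of the division
when APERF advances faster than MPERF, which is the IDA case mentioned
above.
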
> governors: It isn't clear why there needs to be a governor
> per core.  It seems to be unused today, except on incorrectly
> administered systems.
> 
> user-space: cpuspeed and powernowd are not used so much these days.
> 
> The fabled DPM/PowerOP/cpufreq integration isn't happening fast.
> Per previous discussion, an abstract notion of Operating Points
> makes the most sense, and perhaps dealing in units of absolute
> MHz is not the right model.  Though users are now accustomed to
> thinking they know the absolute MHz....
> 
> Dave Jones was open to the idea of transforming cpufreq into a
> generic clock scaling implementation.
> 
> Dave mentioned that Fedora Core 7 32-bit is now shipping with
> CONFIG_NO_HZ=y and CONFIG_HZ=1000.

CONFIG_NO_HZ is known to break suspend and resume on some machines.  These
problems are being fixed over time, but it's a risky decision for a
distribution to switch it on by default.

> Kevin Hilman led a discussion on DPM (Dynamic Power Management,
> http://dynamicpower.sourceforge.net/)
> 
> DPM has been shipping since Linux-2.4 and is a part of many
> successful products, so it will continue to be supported.
> 
> One key aspect of DPM is that it allows customers to put their
> platform-specific proprietary control code in user-space.
> 
> DPM has hooks in the scheduler where applications explicitly
> request an operating state.
> 
> MontaVista is hoping to migrate to mainline, now that mainline is
> becoming more capable.  In particular, they need solid tickless,
> cpufreq, and wake-up events.
> 
> Paul Mundt described the cutting edge in the Super-H space.
> The SH4A-SMP has 4 cores and is expected to be used in high-end
> consumer electronics, navigation etc.  It has per-core voltage
> regulation, and CPU offline saves real power.  Often ITRON is
> run on a core.
> 
> Mark Gross led a discussion on Device QOS Parameters, to see
> if a common language might be suitable, say in a sysfs interface.
> We brain-stormed on how throughput, rate, power gain, latency,
> acoustic and timeout applied to various classes of devices;
> such as storage, wired and wireless networks, and the display.
> 
> Suspend/Resume:
> Earlier on the list, Linus stated that he might
> prefer multiple entry points that do simpler functions rather
> than the over-loaded .suspend/.resume I/F we have today.
> 
> Adam Belay described a 2-pass device suspend-to-RAM loop, where .stop is
> first called for each device before the first .suspend is called:
> 
> .start
> .stop
>         don't touch hardware
>         able to return failure
> .suspend(target state)
>         saves HW state
>         enable wake feature
>         invoke D-state (power-off)
> [take STD snapshot here]
> .resume
> 
> There is also a .reset especially for kexec that can be called
> after .stop.  It removes the IRQ and int src.

I think we'll need some more callbacks than that.  For example, we may need to
add a prepare_to_stop() callback allowing the driver to allocate additional
memory etc. before .stop() is called.

> The .stop loop allows a device to veto the suspend and for the
> system to quickly back out of the operation.

If we want to remove the freezer, we may want to use .stop() to make the driver
start blocking I/O data going from processes to the device and the other way
around.
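
For illustration only, the two-pass loop with a veto in the .stop pass
could be sketched as follows.  This is Python rather than kernel C, the
Device class and the suspend_system() helper are hypothetical, and it
treats the devices as a flat list, ignoring parent/child ordering and
any additional callbacks.

```python
class Device:
    """Toy device exposing the callbacks named in the minutes."""

    def __init__(self, name, log, can_stop=True):
        self.name = name
        self.log = log
        self.can_stop = can_stop

    def stop(self):
        # Pass 1: must not touch hardware, and is allowed to fail.
        self.log.append("stop " + self.name)
        return self.can_stop

    def start(self):
        # Undo .stop, either on back-out or after resume.
        self.log.append("start " + self.name)

    def suspend(self, target_state):
        # Pass 2: save HW state, enable wake, enter the D-state.
        self.log.append("suspend " + self.name + " " + target_state)


def suspend_system(devices, target_state):
    """Return True if the system suspended, False if a device vetoed."""
    stopped = []
    for dev in devices:
        if not dev.stop():
            # Veto: nothing has touched hardware yet, so backing out
            # only means restarting what was already stopped.
            for done in reversed(stopped):
                done.start()
            return False
        stopped.append(dev)
    # Every device agreed; only now is any hardware state changed.
    for dev in devices:
        dev.suspend(target_state)
    return True
```

The point of the first pass is that a veto is cheap: a refusal from any
device is handled before a single .suspend has run.
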

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
