* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri, 4 Jun 2010, Ingo Molnar wrote:
> >
> > What you say is absolutely true, hence this would be driven via
> > sched_tick() + TIF notifiers - i.e. only ever treat user-mode tasks as
> > 'idle-able'. This can be done with no overhead to the regular fastpaths.
> >
> > The TIF notifier would be the one scheduling to idle - and would thus do
> > it only to user-mode tasks.
>
> The thing is, unless there is some _really_ deep other reason to do
> something like this, I still think it's total overdesign to push any
> knowledge/choices like this into the scheduler. I'd rather keep things way
> more independent, less tied to each other and to deep kernel subsystems.

Well, the deep reason as i see it is simply the observation that what the
Android auto-suspend code implements via the suspend-blocker patches is an
idle driver and user-space scheduler in disguise. (if you count that as a
deep enough reason)

I don't mind hacks if they are local and if i don't have to maintain them,
but the objection from other folks was that suspend blockers are not that
local and not that maintainable.

And if (and that's a big if) we have a global effect anyway, then we might as
well consider implementing it cleanly:

 - A global /sys flag is fundamentally racy and only allows a single
   user-space actor. Not a problem on mobile phones but sure violates taste
   buds. Proper per task latency attributes are not racy - we always know the
   maximum/minimum values, without user-space interfering with each other.

 - When done correctly we might win a couple of new features as well around
   the fringes:

     - Useful for power savings on mobile: crappy apps can be idled on an
       intermediate level, even before the system goes totally idle. There's
       no equivalent suspend-blockers feature.

     - Useful for real-time tasks that want to idle lower prio tasks when
       some really important thing is running - even if the real-time task
       might sleep.
       This is superior to the 'hog the CPU' kind of hacks that have been
       used for this purpose before.

 - The hacks needed to express a race-free suspend/wakeup cycle are unnatural
   and stem from the model being a user-space driven idle manager instead of
   a proper part of task sleep/wakeup.

 - None of this code seems to impact any scheduler hotpath (most of it is
   just a special form of idle driver) - it's all on deeper levels of idle
   and, at most, in off-line return-to-userspace codepaths. So there's no
   strong performance reason _against_ some level of integration. There is
   indeed the coupling effect as you mention, which weighs against.

 - i also think Android's auto-suspend is a strategic feature to Linux: i
   think auto/opportunistic suspend will matter more and more, and my guess
   is that in ten years most of our daily systems will be doing auto-suspend
   and will have proper wakeups from suspend implemented in hardware. Not
   just phones and gadgets but also portable tablets, book readers, TVs - and
   i wouldn't mind a non-portable, table sized tablet either ;-)

   At which point i'd hate to have some hack of a solution ingrained and
   ABI-ized with little chance to move user-space to sanity.

But yes, i definitely agree with you that it all comes down to 'do we care':

 - If we care we should integrate it intelligently where it belongs
   conceptually: the idle drivers and the scheduler.

 - If we don't care then we should isolate the hacks as much as possible -
   and then the current suspend blocker patch-set is definitely a good basis
   to start. (with perhaps the /sys hackery cleaned up a bit, as you
   suggested)

I don't favor either of the solutions too deeply - so i personally have not
NAK-ed suspend blockers - i just saw half a dozen semi-NAKs flying from other
folks, so tried to help come up with a palatable design.

_If_ most of x86 hardware was able to suspend race-free i think deeper
integration would be a slam-dunk - as we could make it work almost everywhere.
Sadly only a tiny subset of x86 qualifies, so the argument isn't obvious.

Maybe we should pick a variant of suspend blockers and re-examine things in a
few years? It being an ABI makes it difficult tho.

What i would personally find unacceptable is to have _neither_ solution - and
the discussion was heading towards that stage really, with both sides digging
the trenches of non-cooperation. IMHO we just cannot afford to let this drop
on the floor as the feature is immensely useful to Android and thus to Linux
at large.

Anyway, i'm glad that it's up to you ;-)

	Ingo