On Thu, Jun 3, 2010 at 7:16 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Fri, 4 Jun 2010, Ingo Molnar wrote:
>>
>> What you say is absolutely true, hence this would be driven via sched_tick() +
>> TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can
>> be done with no overhead to the regular fastpaths.
>>
>> The TIF notifier would be the one scheduling to idle - and would thus do it
>> only to user-mode tasks.
>
> The thing is, unless there is some _really_ deep other reason to do
> something like this, I still think it's total overdesign to push any
> knowledge/choices like this into the scheduler. I'd rather keep things way
> more independent, less tied to each other and to deep kernel subsystems.
>
> IOW, my personal opinion is that something like a suspend (blocker or not)
> decision simply shouldn't be important enough to be tied into the
> scheduler. Especially not if it could just be its own layer.
>
> That said, as far as I know, the Android people have mostly been looking
> at the suspend angle from a single-core standpoint. And I'm not at all
> convinced that they should hijack the existing "/sys/power/state" thing
> which is what I think they do now.
>

While it is true that we have not used this code on a multi-core
system yet, I'm not sure why multiple cores would affect it. We
annotate that work needs to be done before it is safe to suspend, but
we don't care which core does the work (or if multiple cores do pieces
of it).

> And those two things go together. The /sys/power/state thing is a global
> suspend - which I don't think is appropriate for a opportunistic thing in
> the first place, especially for multi-core.
>
> A well-designed opportunistic suspend should be a two-phase thing: an
> opportunistic CPU hotunplug (shutting down cores one by one as the system
> is idle), and not a "global" event in the first place.
> And only when you've reached single-core state should you then say
> "do I suspend the system too".
>

This seems to fit better into the cpuidle and/or frequency scaling
framework.

> So I've tried to look a bit at the patches, and my admittedly rough
> comments so far is
>
>  - I really do prefer the "off to the side" approach that the current
>    google opportunistic suspend patches have. As mentioned, I don't think
>    this should be deep in the scheduler. Not at all.
>
>  - I do think there are possibly races and CPU idle issues there, but I
>    think they are mainly for the multi-core thing. And I think that's a
>    totally separate issue. Or it _should_ be.
>

I'm not aware of any races with multi-core systems unless there are
existing problems in suspend. We check if any suspend blockers are
active after disable_nonboot_cpus() has returned.

>  - once you're single-core (whether because you never had more cores to
>    begin with, or because the "opportunistic CPU offlining" has taken down
>    the other cores), I think the suspend-blocker is fine as a concept, and
>    certainly shouldn't need any deep scheduler hooks.
>
> so I'd like to see the opportunistic suspend thing think about CPU
> offlining,

I see this as a separate problem. We ignore a single busy CPU for
opportunistic suspend, so why should the number of online CPUs matter?

> and I'd like to see it disconnect from the existing
> /sys/power/state.

The entry point is not important to us. The current interface is what
Rafael wanted instead of the /sys/power/request-state interface, which
is what we changed it to last year.

> And I'd really not like to involved deep internal kernel
> hooks into it.
>
> But I'll also admit that maybe I'm not seeing some problems. I've frankly
> tried to avoid the whole discussion until Andrew pulled me in yesterday.
>
>                Linus
>

--
Arve Hjønnevåg
_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
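
For readers following the thread, the check described in the reply above (testing for active suspend blockers once disable_nonboot_cpus() has returned) can be sketched as a small userspace model in C. This is only an illustration under stated assumptions, not the code from the patches: struct pm_model, blocker_acquire(), blocker_release(), offline_hook and try_suspend() are all hypothetical names, and the CPU offlining step is reduced to setting a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model. None of these names are kernel APIs. */
struct pm_model {
    int active_blockers; /* suspend blockers currently held */
    int online_cpus;     /* CPUs currently online */
};

/* Annotate that work must finish before it is safe to suspend. */
static void blocker_acquire(struct pm_model *pm)
{
    pm->active_blockers++;
}

/* Mark that work as done; it does not matter which CPU did it. */
static void blocker_release(struct pm_model *pm)
{
    assert(pm->active_blockers > 0);
    pm->active_blockers--;
}

/* Optional hook: simulates an event arriving while CPUs go offline. */
typedef void (*offline_hook)(struct pm_model *pm);

/*
 * Take the non-boot CPUs down first, then check the blockers.
 * Returns true if the model would go on to suspend; otherwise it
 * restores the previous CPU count and returns false.
 */
static bool try_suspend(struct pm_model *pm, offline_hook during_offline)
{
    int cpus_before = pm->online_cpus;

    pm->online_cpus = 1;               /* stands in for disable_nonboot_cpus() */
    if (during_offline)
        during_offline(pm);

    if (pm->active_blockers > 0) {     /* the check happens after offlining */
        pm->online_cpus = cpus_before; /* abort: bring the CPUs back */
        return false;
    }
    return true;
}
```

In this model the number of online CPUs never enters the decision. Only the blocker count does, and because it is read after the offlining step, a blocker taken while the CPUs were going down still aborts the attempt.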