On Aug 1, 2006, at 3:09 AM, Matthew Locke wrote:
>
> On Jul 31, 2006, at 5:59 PM, david singleton wrote:
>
>>
>> On Jul 30, 2006, at 4:02 AM, Vitaly Wool wrote:
>>
>>> David,
>>>
>>> On 7/30/06, david singleton <dsingleton at mvista.com> wrote:
>>>
>>>> That's one of the simple parts of the concept. There isn't any
>>>> runtime operating point creation. It's one of the things I like
>>>> best about cpufreq: the frequencies and voltages are taken from
>>>> the hardware vendor data sheet and validated.
>>>>
>>>> The user just gets to use the operating points supported by the
>>>> system, not choose the frequency or voltage to transition to.
>>>>
>>>> By just presenting the supported operating points to the user it
>>>> removes the need for new APIs. The user just reads the supported
>>>> operating points and decides the best use of the supported
>>>> operating points.
>>>
>>> I see this approach as fundamentally wrong, at least because it will
>>> produce very long and hard to manage lists of operating points.
>>> Suppose you have 20 hardware vendor approved core CPU frequency
>>> values, 3 possible voltage values and 10 approved DSP CPU frequency
>>> values (which are derived from the other PLL). It is not impossible
>>> that almost all combinations are available, which makes it almost 600
>>> operating points. I find it absolutely unreal that anyone enters all
>>> that stuff without mistakes; managing those lists/searching through
>>> them will take significant time, which will slow down the state
>>> transitions; and, finally, it's gonna increase the kernel footprint
>>> quite a bit.
>>
>> Actually in practice there aren't that many supported operating
>> points, even on the hardware you and I are familiar with. I've yet
>> to construct a case where there are more than 16 to 20
>> operating points.
>
> It's not the number of operating points driving the need for run time
> creation.
> Please read the thread that took place early last week on
> this topic. Start from my post here:
> http://lists.osdl.org/pipermail/linux-pm/2006-July/003065.html and read
> backwards.
>
> It's really the embedded device development and silicon vendor model
> driving it. Run time creation is required, and enabling run time
> creation doesn't prevent some architectures/board ports from hard
> coding their points.
>
>>
>> And the Linux device model allows the system to be set at
>> a particular operating point and then to suspend the LCD
>> or unused USB if so desired. So the combination flexibility
>> is still available.
>>
>> If there were 600 supported operating points that would be a
>> very good reason to use PowerOp. I'm not sure I'd want
>> the user passing all the frequencies, voltages, clock
>> divisors and clock multipliers for all those operating points.
>
> Well, no one is suggesting a user define and install that info.
> Operating point creation will be done by someone who understands the
> system (system designer) regardless of the method used to get the
> operating points in the kernel.
>
>>
>> List manipulation takes place at compile time and list traversal
>> is simple. If a powerop were to become a kobject, management
>> and traversal would still be simple.
>>
>> The footprint actually shrinks if you take into account all the
>> class, policy and governor code that wouldn't be needed if
>> all supported states were simple operating points.
>>
>>>
>>> It looks to me that the concept that the kernel can implement
>>> rules/restrictions for operating points but shouldn't define them,
>>> with possible exception for the most essential ones, far better
>>> suits both embedded and non-embedded use cases.
>>
>> CPUFREQ shows that it can, and I believe should, define the operating
>> points the system supports. CPUFREQ does NOT let the user pass
>> frequency or voltage values into the kernel.
>> It shows the hardware
>> vendor certified and validated frequencies and voltages.
>>
>> I really like that concept. It simplifies things greatly.
>>
>>>
>>>>> 2) interface (kernel as well as userspace (sysfs)) for the rest
>>>>> of the power parameters except cpu voltage and frequency
>>>>
>>>>
>>>> The /sys/power/supported_states file shows the supported operating
>>>> points and their parameters.
>>>>
>>>> The platform specific information is hidden through the md_data
>>>> pointer, which in the case of embedded systems with complex clocking
>>>> schemes, contains the clock divisor and multiplier information that
>>>> the system needs to perform frequency and voltage scaling and clock
>>>> manipulation.
>>>>
>>>> The machine dependent portion of a centrino operating point
>>>> is only the perfctl msr bits for each frequency/voltage. For
>>>> a system with 5 power domains and various clocks the
>>>> machine dependent portion contains the whole array
>>>> of information for the different power domains and their clocks.
>>>
>>> Basically I don't see too much sense in your definition of
>>> PM_FREQ_CHANGE and PM_VOLT_CHANGE. The latter one just isn't used
>>> anywhere although the voltage differs between the operating points
>>> for your centrino example. And it's quite a common thing when
>>> frequency and voltage are changed within the same transition; so
>>> those either should be bitfields or something like PM_STATE_CHANGE.
>>
>>
>> The example patch isn't provided to show how it should be implemented.
>>
>> I've added a separate PowerOp state of PM_VOLT_CHANGE for
>> hardware that may be changing states by changing a voltage rather
>> than having the voltage changed as a side effect of changing the
>> frequency explicitly.
>>
>>>
>>>>>
>>>>> 3) per platform nature of an operating point rather than per
>>>>> a pm control layer (cpufreq for ex.):
>>>>> - you have cpu freq and voltage defined in common code
>>>>>   while it's still possible that on a certain platform one
>>>>>   would not be interested in control of these parameters
>>>>
>>>> Correct, but on all of the hardware with which I'm familiar cpu
>>>> frequency and voltage are common components to power management.
>>>
>>> I do agree, but there might be different voltages and different CPU
>>> frequencies within the same SoC, so it will mean that you separate,
>>> say, two CPU frequencies between common code and SoC-specific code.
>>> Maybe it's still the way to go, but it makes things quite complicated
>>> to understand from scratch.
>>>
>>
>> After digging through all the PM, CPUFREQ and Dynamic Power
>> Management code it became apparent that when they get down to
>> touching hardware they are just dealing with an operating point.
>> And they all change from one operating point to another in the
>> same manner.
>>
>> Once you view all the states a system can be in as an operating
>> point, whether it's a suspend or frequency change, things get much
>> simpler.
>
>
>> And
>>
>> David
>>
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm at lists.osdl.org
>> https://lists.osdl.org/mailman/listinfo/linux-pm
>>
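
To make the two positions in this thread concrete, here is a minimal
sketch in Python. It is an illustration only and is not taken from the
PowerOp patch: the OperatingPoint class, its field names, the table
entries and every frequency, voltage and perfctl value are invented.
Only the md_data name, the 20 x 3 x 10 example and the idea of a fixed
list of supported points come from the messages above.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Optional


@dataclass(frozen=True)
class OperatingPoint:
    # Hypothetical layout, NOT the structure used by the PowerOp patch.
    name: str
    frequency_khz: int
    voltage_mv: int
    # Machine-dependent part, opaque to common code (the thread calls
    # this md_data); here it is just a dict with made-up contents.
    md_data: Any = None


# A fixed table of supported points, in the spirit of the "vendor
# validated list" position. All values below are invented placeholders.
SUPPORTED_POINTS = (
    OperatingPoint("high", 1_600_000, 1400, md_data={"perfctl": 1}),
    OperatingPoint("medium", 1_000_000, 1200, md_data={"perfctl": 2}),
    OperatingPoint("low", 600_000, 1000, md_data={"perfctl": 3}),
)


def find_point(name: str) -> Optional[OperatingPoint]:
    """Linear scan of the fixed table; returns None for unknown names."""
    for point in SUPPORTED_POINTS:
        if point.name == name:
            return point
    return None


def count_combinations(core_freqs: int, voltages: int, dsp_freqs: int) -> int:
    """Upper bound on table size if every combination were a valid point."""
    return sum(
        1 for _ in product(range(core_freqs), range(voltages), range(dsp_freqs))
    )


if __name__ == "__main__":
    print(find_point("low"))
    print(count_combinations(20, 3, 10))
```

With a handful of entries the table is easy to write by hand and a
linear scan is cheap, which is the first position. If every combination
in the 20 x 3 x 10 example were valid, the same table would need 600
hand-written entries, which is the objection raised against it.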