David,

On 7/30/06, david singleton <dsingleton at mvista.com> wrote:
> That's one of the simple parts of the concept. There aren't any runtime
> operating point creation. It's one of the things I like best about
> cpufreq, the frequency and voltages are taken from the hardware vendor
> data sheet and validated.
>
> The user just gets to use the operating points supported by the system,
> not choose the frequency or voltage to transition to.
>
> By just presenting the supported operating points to the user it removes
> the need for new APIs. The user just reads the supported operating
> points and decides the best use of the supported operating points.

I see this approach as fundamentally wrong, at least because it will
produce very long and hard-to-manage lists of operating points. Suppose
you have 20 hardware-vendor-approved core CPU frequency values, 3
possible voltage values and 10 approved DSP CPU frequency values (which
are derived from the other PLL). It's quite possible that almost all
combinations are available, which makes almost 600 operating points
(20 x 3 x 10). I find it absolutely unrealistic that anyone enters all
that stuff without mistakes; managing those lists and searching through
them will take significant time, which will slow down the state
transitions; and, finally, it's gonna increase the kernel footprint
quite a bit.

It looks to me that the concept where the kernel implements
rules/restrictions for operating points, but doesn't define the points
themselves (with a possible exception for the most essential ones),
suits both embedded and non-embedded use cases far better.

> > 2) interface (kernel as well as userspace(sysfs)) for the rest of power
> > parameters except cpu voltage and frequency
>
> The /sys/power/supported_states file shows the supported operating
> points and their parameters.
>
> The platform specific information is hidden through the md_data pointer,
> which in the case of embedded systems with complex clocking schemes,
> contains the clock divisor and multiplier information that the system
> needs to perform frequency and voltage scaling and clock manipulation.
>
> The machine dependent portion of a centrino operating point
> is only the perfctl msr bits for each frequency/voltage. For
> a system with 5 power domains and various clocks the
> machine dependent portion contains the whole array
> of information for the different power domains and their clocks.

Basically I don't see much sense in your definition of PM_FREQ_CHANGE
and PM_VOLT_CHANGE. The latter just isn't used anywhere, although the
voltage differs between the operating points in your centrino example.
And it's quite common for frequency and voltage to be changed within
the same transition, so those should either be bitfields (so that both
can be set at once) or be replaced by something like PM_STATE_CHANGE.

> > 3) per platform nature of an operating point rather than per
> > a pm control layer (cpufreq for ex.):
> > - you have cpu freq and voltage defined in common code
> > while it's still possible that on a certain platform one would
> > not be interested in control of these parameters
>
> Correct, but on all of the hardware with which I'm familiar cpu
> frequency and voltage are common components to power management.

I do agree, but there might be different voltages and different CPU
frequencies within the same SoC, so it will mean that you split, say,
two CPU frequencies between common code and SoC-specific code. Maybe
it's still the way to go, but it makes things quite hard to understand
from scratch.
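
Coming back to the PM_FREQ_CHANGE/PM_VOLT_CHANGE point: to make the
bitfield suggestion concrete, here is a rough sketch of what I mean.
Only the two flag names come from your patch; the flag values, the
simplified struct op_point and the pm_transition_flags() helper are
made up for illustration and are not meant to match the real
structures.

```c
/* Hypothetical flag values; only the names come from the patch. */
#define PM_FREQ_CHANGE (1u << 0)
#define PM_VOLT_CHANGE (1u << 1)

/* Hypothetical, simplified operating point (not the real struct). */
struct op_point {
	unsigned int freq_khz;	/* CPU frequency in kHz */
	unsigned int volt_mv;	/* core voltage in mV */
};

/*
 * Return a bitmask describing what changes when moving from
 * 'from' to 'to'. Both bits may be set for a single transition.
 */
static unsigned int pm_transition_flags(const struct op_point *from,
					const struct op_point *to)
{
	unsigned int flags = 0;

	if (from->freq_khz != to->freq_khz)
		flags |= PM_FREQ_CHANGE;
	if (from->volt_mv != to->volt_mv)
		flags |= PM_VOLT_CHANGE;

	return flags;
}
```

For two points that differ in both frequency and voltage the helper
returns PM_FREQ_CHANGE | PM_VOLT_CHANGE, so a notifier can test each
bit on its own instead of needing a separate event per parameter.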