> > Low Frequency Mode (LFM), aka Pn - the deepest P-state,
> > is the lowest energy/instruction because it is the highest
> > frequency available at the lowest voltage that can still
> > retire instructions.
> >
> > That is why it is the first method used -- it returns the
> > highest power_savings/performance_impact.
>
> For a straightforward workload on a dual package system, do you get more
> performance from two packages running at their lowest P state or from
> one package at its highest P state and a forced-idle package?
>
> Which consumes more power?

For simplicity, let's say Pn = P1/2, e.g. P1 = 3 GHz and Pn = 1.5 GHz.

Let's assume that the application has zero cache and memory footprint
and performance = CPU cycles. In that case, we could choose to either
cut the frequency of 8 threads in half, or cut the frequency of
4 threads to 0 -- and we have a "cycle count" performance wash.

In reality, applications do care about memory, so keeping both packages
online doubles the available cache, which is a performance win for
P-states.

If the application is something that notices the latency of multiple
threads sharing a CPU vs. one thread per CPU, then P-states would again
win because of more available cores.

Current Intel multi-package systems don't quite get down to P1/2 -- so
the deepest P-state would actually not have quite as much performance
impact. The last AMD system I saw got down to 1 GHz, so on that system
the deepest P-state could (potentially) have a greater performance
impact than off-lining.

Power is more complicated.

We discussed not saving much when an HT sibling is taken offline, right?
The converse is also true: HT siblings are almost free in terms of
power. So even though an HT sibling doesn't have the performance of a
dedicated core, it is actually one of the best performance/watt parts
of the system on many workloads. So you want to disable them last, not
first, by exhausting P-states before you take your siblings offline.

Let's discuss turbo mode. Think of turbo in these terms...
The ideal system to the HW designers is one that can spend a fixed
power+electrical budget in any way on performance, i.e. when some cores
are idle, spend that budget to make the other cores go faster -- faster
even than the advertised P0 frequency.

Because turbo runs at the highest voltage, it has the lowest efficiency
in terms of instructions/energy. Thus it is very important in
power-limited scenarios that turbo (aka P0) be disabled first.

There is a cross-over where a highly optimized C-state will have less
impact on total system efficiency than a reduced P-state. However, we
don't see that cross-over on current Intel systems on the workloads
measured. The reason is that they set Pn to the minimum voltage where
P-states are useful and run as fast as they can at that voltage. If the
system were throttled, say via T-states, down to an extremely low clock
rate, then C-states would win sooner.

-Len Brown, Intel Open Source Technology Center
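
For reference, the "cycle count" comparison made earlier in this message
can be sketched as a few lines of Python. This is only an illustration
of the simplified model (performance = CPU cycles, zero cache and memory
footprint), not a measurement. The 8-thread count and the 3 GHz / 1.5 GHz
figures come from the example in the text; the function and variable
names are hypothetical.

```python
def aggregate_cycles(freqs_ghz):
    """Sum of per-thread clock rates, in billions of cycles per second.

    Assumes the simplified model from the text: performance equals
    CPU cycles, with no cache, memory, or HT-sharing effects.
    """
    return sum(freqs_ghz)


P1_GHZ = 3.0          # highest non-turbo frequency in the example
PN_GHZ = P1_GHZ / 2   # deepest P-state, assumed to be exactly P1/2
THREADS = 8

# Option A: keep every thread online, all at the deepest P-state.
all_at_pn = [PN_GHZ] * THREADS

# Option B: keep half the threads at P1, take the other half to 0.
half_offline = [P1_GHZ] * (THREADS // 2) + [0.0] * (THREADS // 2)

pstate_cycles = aggregate_cycles(all_at_pn)
offline_cycles = aggregate_cycles(half_offline)

print(f"all threads at Pn : {pstate_cycles} Gcycles/s")
print(f"half offline at P1: {offline_cycles} Gcycles/s")
```

Under this model both options come out equal whenever Pn = P1/2. If Pn
is above P1/2, the P-state option delivers more cycles; if it is below,
off-lining does. Cache and latency effects, which the model ignores,
favor the P-state option in either case.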