> > ... we think we can do better than ACPI.
> Why exactly? Is there any info missing in the ACPI tables?
> Or is this just to be more independent from OEMs?

ACPI has a few fundamental flaws here. One is that it reports
exit latency instead of break-even power duration. The other is
that it requires a BIOS writer to get the tables right.
Both of these are fatal flaws.

There are also more subtle problems, like bogus ACPI implementations
mapping LAPIC-breaking C-states to ACPI-C2, which forces Linux to
assume the LAPIC is always broken in C2 -- which is erroneous.

I'll be speaking on this topic at length at Linuxcon this summer.

> > Indeed, on my (production level commercially available) Nehalem desktop
> > the ACPI tables are broken and an ACPI OS idles at 100W. With this
> > driver the box idles at 85W.
> What exactly was broken there?

Dell's BIOS developer botched a bug fix immediately before the system
went to market and disabled support for all ACPI C-states except C1.
After several months of shipping systems, they were still unable to
ship them with a fixed BIOS.

Of course, besides a 15% idle power hit, the other effect of that BIOS
issue was to disable all Turbo frequencies -- which is a somewhat
important feature on a Core-i7 desktop...

> IMO this is a step backward.

I don't dispute your right to have an opinion :-)

> CPUfreq runs rather well on nearly every machine supporting it without
> tons of static frequency tables in kernel. Even powernow-k8 might get merged
> into acpi-cpufreq.

There are a couple of important differences between cpufreq and
idle state enumeration.

P-states are per-bin within each model. Idle states not only span bins
within a model, they span multiple models, which in turn span multiple
years.

Note also that the idle tables are validated at run-time by CPUID.MWAIT,
which means the same table can be used for multiple parts -- the parts
themselves know which states they have -- and they can tell us.
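
To make the run-time validation concrete, here is a minimal sketch (not
the intel_idle code itself) of how a static table could be filtered
against CPUID.MWAIT. It assumes the documented layout of CPUID leaf 5,
where each 4-bit field of EDX gives the number of MWAIT sub-states for
one C-state, and that bits 7:4 of an MWAIT hint select the target
C-state. The struct, the function names, and the table entries are
hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical table entry; not the driver's actual structure. */
struct idle_state {
	const char *name;
	uint32_t mwait_hint;	/* value passed to MWAIT in EAX */
};

/*
 * Return the number of sub-states that CPUID.MWAIT (leaf 5, EDX)
 * reports for the C-state targeted by an MWAIT hint.
 * EDX bits 3:0 describe C0, bits 7:4 describe C1, and so on,
 * so hint bits 7:4 == 0 selects the C1 field.
 */
static unsigned int mwait_substates_for_hint(uint32_t cpuid5_edx,
					     uint32_t mwait_hint)
{
	unsigned int cstate = (mwait_hint >> 4) & 0xF;

	/* Only eight 4-bit fields exist; avoid shifting past bit 31. */
	if (cstate >= 7)
		return 0;

	return (cpuid5_edx >> ((cstate + 1) * 4)) & 0xF;
}

/* Count the table entries that the running part says it supports. */
static unsigned int count_usable_states(const struct idle_state *table,
					unsigned int n,
					uint32_t cpuid5_edx)
{
	unsigned int i, usable = 0;

	for (i = 0; i < n; i++) {
		if (mwait_substates_for_hint(cpuid5_edx,
					     table[i].mwait_hint) > 0)
			usable++;
	}
	return usable;
}
```

With one shared table, a part that reports no sub-states for the
deepest entry simply ends up with a shorter list of usable states; no
per-part table is needed. On real hardware the EDX value would come
from executing CPUID with EAX = 5.
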

So I don't expect a proliferation of idle tables in intel_idle.
I do expect to tune some of the latencies based on information
that Intel instructs BIOS writers to convey, but which they fail to
convey. In particular, the actual latencies and power break-even
points of the same model in different configurations are different.
I've not seen a single BIOS get that part right.

I expect a new table to cover Sandy Bridge plus the generation after it.

> Intel set up a huge ACPI API for this and now it's not used anymore?!?
> Will these parts get obsoleted in a future spec?

Both p-states and c-states will be moving to a more native
enumeration method - but there will still be BIOS ACPI support
wrapping that enumeration as long as somebody wants to run a legacy
ACPI OS that knows nothing else.

> While for C-states there are not that many static entries needed, another
> drawback could be that OEMs will disable/hide C-states on purpose.

Yes, there is a real possibility that a system has a device in it
that malfunctions when a deep C-state is used. On Linux, we invented
PM_QOS to address exactly this problem. The number of devices that
need a PM_QOS user is still quite small.

> Using ACPI table based C-states by default and using intel_idle.enable=1
> or similar for workarounds sounds safer.
> At least as long as the driver is experimental.

I plan to remove the EXPERIMENTAL in one release.

> Does Windows use ACPI C-state info for idle?

Yes, Windows uses ACPI. On the Dell above, that is why Linux
consumes 15% less idle power and why Linux can take advantage of
turbo mode and Windows cannot.

cheers,
Len Brown, Intel Open Source Technology Center

_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm