On 2011.06.01 at 13:34 -0400, David C Niemi wrote:
> On 06/01/2011 12:08 PM, Markus Trippelsdorf wrote:
> > There seems to be a major difference in the behavior of the ondemand
> > governor depending on whether CONFIG_NO_HZ is set or not in the kernel
> > .config.
> >
> > In the NO_HZ case the ondemand governor spends too much time at the
> > highest frequency and is also very trigger happy.
> >
> > I have compared the two cases on my system:
> > powernow-k8: Found 1 AMD Phenom(tm) II X4 955 Processor (4 cpu cores) (version 2.20.00)
> > powernow-k8:    0 : pstate 0 (3200 MHz)
> > powernow-k8:    1 : pstate 1 (2500 MHz)
> > powernow-k8:    2 : pstate 2 (2100 MHz)
> > powernow-k8:    3 : pstate 3 (800 MHz)
> >
> > When I run:
> >   watch -n.1 'cat /proc/cpuinfo|grep MHz'
> > on an otherwise idle system, I can see that the frequency always stays
> > at 800 MHz in the "CONFIG_NO_HZ not set" case. But it will very
> > frequently switch to 3200 MHz in the CONFIG_NO_HZ=y case under the same
> > conditions.
> >
> > This also manifests itself in the cpufreq/stats/time_in_state
> > statistics (again on a mostly idle system):
> >
> > First taken with:
> >   echo 200 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
> > (BTW wouldn't it make sense to use something like this as the default
> > value?)
> >
> >   cat /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
> >
> > CONFIG_NO_HZ not set:
> > 3200000 5845
> > 2500000 0
> > 2100000 5
> > 800000 31552
> >
> > CONFIG_NO_HZ=y:
> > 3200000 17650
> > 2500000 0
> > 2100000 0
> > 800000 31129
> >
> >
> > And with the default sampling_down_factor=1
> >
> > CONFIG_NO_HZ not set:
> > 3200000 140
> > 2500000 2
> > 2100000 29
> > 800000 16614
> >
> > CONFIG_NO_HZ=y:
> > 3200000 538
> > 2500000 9
> > 2100000 77
> > 800000 16287
> >
> > Now my question is, is this expected? And what could be done to make the
> > NO_HZ behavior more like the "CONFIG_NO_HZ not set" behavior.
>
> A very interesting bit of information.
> What do you have set for
> up_threshold? You may have to set it higher for CONFIG_NO_HZ than
> without, based on your symptoms. Another thing to look at is your
> sampling_rate. I'm guessing it differs between CONFIG_NO_HZ being set
> or not.

I've played with all those parameters, but unfortunately it didn't make
any difference.

> And perhaps you need to set sampling_down_factor a bit lower. I
> consider 100 a reasonable default, but a default of "1" was put in
> initially to make the behavior of the patch that enabled the factor
> identical with not having the patch. If you are more concerned with
> saving power than maximizing throughput, you might consider a much
> lower value like 5 or 10.

Yes, I've tried different values and 200 turned out to be the best based
on my preferences (throughput over power saving). It makes a big
difference in the compile time of bigger projects, especially during the
configuration phase.

But I have found the root cause of symptoms described above by
bisection. It turned out that 2.6.39 is also affected, so I've bisected
down to 2.6.38. This is the result:

5cb2c3bd0c5e0f3ced63f250ec2ad59d7c5c626a is the first bad commit
commit 5cb2c3bd0c5e0f3ced63f250ec2ad59d7c5c626a
Author: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Date:   Mon Feb 7 17:14:25 2011 +0100

    [CPUFREQ] calculate delay after dbs_check_cpu

When I revert the above in 3.0-rc1 the CONFIG_NO_HZ=y symptoms vanish.

-- 
Markus
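
The time_in_state readings quoted above were taken over different
intervals, so the raw counters are hard to compare directly. A minimal
Python sketch for turning them into the share of time spent at each
frequency follows. The function name time_in_state_shares and the
constants NO_HZ_OFF / NO_HZ_ON are illustrative assumptions, not part of
any kernel tooling; the sample values are the sampling_down_factor=200
readings from this thread.

```python
def time_in_state_shares(text):
    """Parse cpufreq time_in_state text into {freq_khz: share_of_total}.

    Each input line is expected to be "<frequency in kHz> <time counter>".
    Hypothetical helper, written only to compare the samples in this thread.
    """
    counts = {}
    for line in text.strip().splitlines():
        freq_khz, ticks = line.split()
        counts[int(freq_khz)] = int(ticks)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {freq: ticks / total for freq, ticks in counts.items()}


# Readings quoted in the thread (sampling_down_factor=200).
NO_HZ_OFF = """3200000 5845
2500000 0
2100000 5
800000 31552"""

NO_HZ_ON = """3200000 17650
2500000 0
2100000 0
800000 31129"""

if __name__ == "__main__":
    for label, sample in (("CONFIG_NO_HZ not set", NO_HZ_OFF),
                          ("CONFIG_NO_HZ=y", NO_HZ_ON)):
        print(label)
        for freq, share in time_in_state_shares(sample).items():
            print(f"  {freq // 1000:>5} MHz  {share:6.1%}")
```

On a live system the same function could be fed the contents of
/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state instead of the
hard-coded strings.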