Re: v3.13.5 intel_pstate: cpufreq: __cpufreq_add_dev: ->get() failed

Some MSR values for the troubling CPU 1 (I haven't rebooted since the error):

Min P-state: 12
Max P-state: 34
Turbo P-state: 40

TSC:    1105403635263536
APERF:    153644110734887
MPERF:    142432205366417
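
For reference, a minimal sketch of one way to read such counters from
user space. It assumes the msr driver is loaded (so /dev/cpu/N/msr
exists) and that the caller has permission to read it; read_msr() is a
hypothetical helper name. The register numbers are the architectural
IA32_MPERF (0xE7) and IA32_APERF (0xE8).

```c
#define _XOPEN_SOURCE 700 /* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_MPERF 0xE7
#define MSR_IA32_APERF 0xE8

/*
 * Read one 64-bit MSR of the given CPU through the msr character device.
 * Returns 0 on success and -1 if the device cannot be opened or read
 * (driver not loaded, no such CPU, insufficient privileges).
 */
static int read_msr(int cpu, uint32_t reg, uint64_t *val)
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset selects the MSR number. */
    n = pread(fd, val, sizeof(*val), (off_t)reg);
    close(fd);

    return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

Calling read_msr(1, MSR_IA32_APERF, &v) and read_msr(1, MSR_IA32_MPERF, &v)
would fetch the two counters for CPU 1. The two reads are not atomic
with respect to each other, so a ratio derived this way is only
approximate.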

It is a bit surprising that APERF is larger than MPERF, since turbo
isn't working on CPU 1.

The APERF/MPERF ratio should be close to 1.0 at boot time. Are the
counters reset by Linux? Can there be a race condition that throws off
the ratio?
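
To make the ratio question concrete, here is a small sketch that
computes the plain APERF/MPERF ratio and a fixed-point value of the
same shape as the core_pct expression quoted below. It is an
illustration under assumptions, not the driver code: FRAC_BITS is
assumed to be 8, to_fp() is an unsigned stand-in for int_tofp() so
that wrap-around is well defined in C, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8 /* assumption: width of the fixed-point fraction */

/* Unsigned stand-in for int_tofp(); the shift wraps modulo 2^64. */
static uint64_t to_fp(uint64_t x)
{
    return x << FRAC_BITS;
}

/* Same shape as the quoted expression: to_fp(aperf * 100) / mperf. */
static uint64_t core_pct_fp(uint64_t aperf, uint64_t mperf)
{
    assert(mperf != 0);
    return to_fp(aperf * 100) / mperf;
}

/* Plain floating-point ratio, for comparison. */
static double perf_ratio(uint64_t aperf, uint64_t mperf)
{
    assert(mperf != 0);
    return (double)aperf / (double)mperf;
}
```

With the counter values above, perf_ratio() gives roughly 1.079 and
core_pct_fp() shifted right by FRAC_BITS gives 107, so these
particular values do not produce zero. Under the same assumptions, the
result is zero when aperf * 100 * 256 is smaller than mperf, or when
that product wraps past 64 bits: for example, the hypothetical pair
aperf = mperf = 720575940379280 has a true ratio of 1.0 but yields 0.
Whether either case is what actually happened on CPU 1 is not
something this sketch can show.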

On 10 March 2014 06:23, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> Cc'ing relevant people..
>
> On Fri, Mar 7, 2014 at 11:49 PM, Patrik Lundquist
> <patrik.lundquist@xxxxxxxxx> wrote:
>> Hi,
>>
>> booting 3.13.5 on a dual socket Ivy Bridge-EP resulted in this error:
>>
>> [    0.194139] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2687W v2 @ 3.40GHz (fam: 06, model: 3e, stepping: 04)
>> ...
>> [    0.246755] x86: Booting SMP configuration:
>> [    0.250935] .... node  #0, CPUs:        #1  #2  #3  #4  #5  #6  #7
>> [    0.357648] .... node  #1, CPUs:    #8  #9 #10 #11 #12 #13 #14 #15
>> [    0.553293] x86: Booted up 2 nodes, 16 CPUs
>> [    0.557666] smpboot: Total of 16 processors activated (108850.19 BogoMIPS)
>> ...
>> [    5.210204] Intel P-state driver initializing.
>> [    5.232407] Intel pstate controlling: cpu 0
>> [    5.253628] Intel pstate controlling: cpu 1
>> [    5.274899] cpufreq: __cpufreq_add_dev: ->get() failed
>> [    5.294856] Intel pstate controlling: cpu 2
>> [    5.313553] Intel pstate controlling: cpu 3
>> [    5.332526] Intel pstate controlling: cpu 4
>> [    5.352347] Intel pstate controlling: cpu 5
>> [    5.372112] Intel pstate controlling: cpu 6
>> [    5.391097] Intel pstate controlling: cpu 7
>> [    5.410272] Intel pstate controlling: cpu 8
>> [    5.429092] Intel pstate controlling: cpu 9
>> [    5.447714] Intel pstate controlling: cpu 10
>> [    5.465872] Intel pstate controlling: cpu 11
>> [    5.482942] Intel pstate controlling: cpu 12
>> [    5.498414] Intel pstate controlling: cpu 13
>> [    5.513586] Intel pstate controlling: cpu 14
>> [    5.529200] Intel pstate controlling: cpu 15
>>
>> CPU 1 is alive and well but missing the cpufreq driver. The system is
>> running fine otherwise.
>>
>> A closer look at the problem shows that intel_pstate_init_cpu()
>> succeeds but intel_pstate_get(), which cpufreq calls right
>> afterwards, fails.
>>
>> Since all_cpu_data[1] is initialized, it follows that sample->freq
>> must be zero. So the bug should be in intel_pstate_calc_busy(), which
>> incorrectly sets sample->freq to zero.
>>
>> I guess cpu->pstate.max_pstate == 4000000 since that's what
>> cpuinfo_max_freq and scaling_max_freq are on the other cores.
>>
>> So the error is likely that core_pct is calculated as 0 in
>> intel_pstate.c:intel_pstate_calc_busy():
>>
>>     core_pct = div64_u64(int_tofp(sample->aperf * 100),
>>                  sample->mperf);
>>
>>
>>
>> This might be fixed by the following commit, which should be
>> backported in that case:
>>
>> commit fcb6a15c2e7e76d493e6f91ea889ab40e1c643a4
>> Author: Dirk Brandewie <dirk.j.brandewie@xxxxxxxxx>
>> Date:   Mon Feb 3 08:55:31 2014 -0800
>>
>>     intel_pstate: Take core C0 time into account for core busy calculation
>>
>>
>>
>> My options for exploring the problem further by backporting patches
>> and rebooting repeatedly are a bit limited at the moment.
>>
>> Regards,
>> Patrik
>> --
>> To unsubscribe from this list: send the line "unsubscribe cpufreq" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html