Hi Steev,
On 11/5/21 3:39 PM, Steev Klimaszewski wrote:
Hi Lukasz,
[snip]
I've been testing this patchset on the Lenovo Yoga C630, and today while
compiling alacritty and running an apt-get full-upgrade, I found the
following in dmesg output:
Thank you for testing and sending feedback!
Are you using a mainline kernel, or did you apply this patch set on top
of a vendor production kernel? I need to rule out a different code base,
especially any differences in the arch_topology.c init code.
[ 194.343903] ------------[ cut here ]------------
[ 194.343912] WARNING: CPU: 4 PID: 192 at
drivers/base/arch_topology.c:188
topology_update_thermal_pressure+0xe4/0x100
[ 194.343928] Modules linked in: aes_ce_ccm snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device algif_hash algif_skcipher af_alg bnep
cpufreq_ondemand cpufreq_conservative cpufreq_powersave
cpufreq_userspace lz4 lz4_compress zram zsmalloc q6asm_dai q6routing
q6afe_dai q6adm q6asm q6afe q6dsp_common snd_soc_wsa881x q6core
regmap_sdw snd_soc_wcd934x gpio_wcd934x soundwire_qcom snd_soc_wcd_mbhc
wcd934x regmap_slimbus uvcvideo videobuf2_vmalloc venus_enc venus_dec
videobuf2_dma_contig videobuf2_memops qrtr_smd fastrpc apr binfmt_misc
nls_ascii nls_cp437 vfat fat snd_soc_sdm845 snd_soc_rt5663
snd_soc_qcom_common pm8941_pwrkey joydev snd_soc_rl6231 aes_ce_blk
qcom_spmi_adc5 soundwire_bus qcom_vadc_common crypto_simd
qcom_spmi_temp_alarm snd_soc_core cryptd hci_uart snd_compress btqca
industrialio aes_ce_cipher snd_pcm_dmaengine btrtl crct10dif_ce btbcm
ghash_ce snd_pcm btintel gf128mul sha2_ce snd_timer venus_core bluetooth
snd v4l2_mem2mem sha256_arm64 videobuf2_v4l2 videobuf2_common soundcore
[ 194.344007] sha1_ce videodev ecdh_generic ecc mc ath10k_snoc
ath10k_core hid_multitouch ath mac80211 qcom_rng libarc4 qcom_q6v5_mss
cfg80211 sg rfkill qcom_q6v5_pas qcom_pil_info slim_qcom_ngd_ctrl
qcom_wdt pdr_interface qcom_q6v5 evdev rmtfs_mem slimbus qcom_sysmon
fuse configfs qrtr ip_tables x_tables autofs4 ext4 mbcache jbd2
panel_simple rtc_pm8xxx msm llcc_qcom ocmem gpu_sched ti_sn65dsi86
drm_dp_aux_bus drm_kms_helper drm camcc_sdm845 ipa qcom_common
qmi_helpers mdt_loader gpio_keys pwm_bl
[ 194.344056] CPU: 4 PID: 192 Comm: kworker/4:1H Not tainted 5.15.0 #9
[ 194.344060] Hardware name: LENOVO 81JL/LNVNB161216, BIOS
9UCN33WW(V2.06) 06/ 4/2019
[ 194.344062] Workqueue: events_highpri qcom_lmh_dcvs_poll
[ 194.344068] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[ 194.344070] pc : topology_update_thermal_pressure+0xe4/0x100
[ 194.344073] lr : topology_update_thermal_pressure+0x30/0x100
[ 194.344075] sp : ffff800014043d10
[ 194.344076] x29: ffff800014043d10 x28: 0000000000000000 x27:
0000000000000000
[ 194.344080] x26: ffff759c40b66974 x25: ffff759db37a2405 x24:
ffff759c40e83358
[ 194.344084] x23: 0000000000000000 x22: ffffb2699eafe1d8 x21:
00000000002d1e00
[ 194.344087] x20: ffff759c49f5bc20 x19: 000000000000b8cc x18:
0000000000000000
[ 194.344090] x17: 2f756e672d78756e x16: ffffb2699d7d83d0 x15:
0000000000000000
[ 194.344093] x14: 0000000000000000 x13: 0000000000000030 x12:
0101010101010101
[ 194.344096] x11: 7f7f7f7f7f7f7f7f x10: feff68716f676668 x9 :
ffffb2699dd66a58
[ 194.344099] x8 : fefefefefefefeff x7 : 000000000000000f x6 :
0000000000000002
[ 194.344102] x5 : ffffc33415124000 x4 : 0000000000000400 x3 :
0000000000000b19
[ 194.344105] x2 : 0000000000000b8c x1 : ffffb2699e678f40 x0 :
ffffb2699e678f48
[ 194.344108] Call trace:
[ 194.344110] topology_update_thermal_pressure+0xe4/0x100
[ 194.344113] qcom_lmh_dcvs_notify+0xc8/0x160
[ 194.344115] qcom_lmh_dcvs_poll+0x20/0x2c
[ 194.344116] process_one_work+0x1f4/0x490
[ 194.344120] worker_thread+0x188/0x504
[ 194.344121] kthread+0x12c/0x140
[ 194.344125] ret_from_fork+0x10/0x20
[ 194.344128] ---[ end trace bd0039c4fb892d5b ]---
[snip]
It's interesting that we hit this. I should have printed the two values
that are being compared.
Could you apply the change below and try again, please?
Then we would know which values triggered the warning.
---------------------8<-----------------------------------
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index db18d79065fe..0d8db0927041 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -185,8 +185,11 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
/* Convert to MHz scale which is used in 'freq_factor' */
capped_freq /= 1000;
- if (WARN_ON(max_freq < capped_freq))
+ if (max_freq < capped_freq) {
+ pr_warn("THERMAL_PRESSURE: max_freq (%lu) < capped_freq (%lu) for CPUs [%*pbl]\n",
+ max_freq, capped_freq, cpumask_pr_args(cpus));
return;
+ }
capacity = mult_frac(capped_freq, max_capacity, max_freq);
------------------------------>8---------------------------
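For reference, the arithmetic around that check can be reproduced in
userspace. This is only a minimal sketch, not the kernel code:
mult_frac_ul() is a stand-in written here to mirror the kernel's
mult_frac() macro, capped_capacity() is a hypothetical helper name, and
any values passed to it are made up for illustration.

```c
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's mult_frac(x, n, d) macro:
 * computes x * n / d while avoiding overflow of the full product.
 */
static unsigned long mult_frac_ul(unsigned long x, unsigned long n,
                                  unsigned long d)
{
    unsigned long quot = x / d;
    unsigned long rem = x % d;

    return quot * n + (rem * n) / d;
}

/*
 * Hypothetical helper mirroring the check in the diff above.
 * max_freq_mhz is in MHz (the 'freq_factor' scale), capped_freq_khz in kHz.
 * Returns false when the capped frequency exceeds the maximum, which is
 * the condition that triggers the warning; otherwise stores the scaled
 * capacity in *capacity and returns true.
 */
static bool capped_capacity(unsigned long max_freq_mhz,
                            unsigned long capped_freq_khz,
                            unsigned long max_capacity,
                            unsigned long *capacity)
{
    /* Convert to MHz scale, as the kernel function does */
    unsigned long capped_freq_mhz = capped_freq_khz / 1000;

    if (max_freq_mhz < capped_freq_mhz)
        return false;

    *capacity = mult_frac_ul(capped_freq_mhz, max_capacity, max_freq_mhz);
    return true;
}
```

With an assumed max_freq of 2000 MHz and max_capacity of 1024, a cap of
1000000 kHz gives a capacity of 512, while a cap of 2500000 kHz takes the
early-return path that the new pr_warn() reports.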
Could you also dump the cpufreq and capacity sysfs contents for me?
$ grep . /sys/devices/system/cpu/cpu*/cpufreq/*
$ grep . /sys/devices/system/cpu/cpu*/cpu_capacity
Regards,
Lukasz