On Tue, Jan 31, 2023 at 11:36 PM Yongqin Liu <yongqin.liu@xxxxxxxxxx> wrote:
>
> Hi, Kees
>
> This change causes "Kernel panic - not syncing: BRK handler: Fatal exception"
> for the android-mainline based hikey960 build, with this commit reverted,
> there is no problem for the build to boot to the homescreen.
> Not sure if you have any idea about it and give some suggestions.
>
> Here is part of the kernel panic log:
>
> [ 9.479878][ T122] ueventd: Loading module
> /vendor/lib/modules/spi-pl022.ko with args ''
> [ 9.480276][ T115] apexd-bootstrap: Pre-allocated loop device 29
> [ 9.480517][ T123] ueventd: LoadWithAliases was unable to load
> of:Nhi3660_i2sT(null)Chisilicon,hi3660-i2s-1.0
> [ 9.480632][ T121] Unexpected kernel BRK exception at EL1
> [ 9.480637][ T121] Internal error: BRK handler:
> 00000000f2000001 [#1] PREEMPT SMP
> [ 9.480644][ T121] Modules linked in: cpufreq_dt(E+)
> hisi_thermal(E+) phy_hi3660_usb3(E) btqca(E) hi6421_pmic_core(E)
> btbcm(E) spi_pl022(E) hi3660_mailbox(E) i2c_designware_platform(E)
> mali_kbase(OE) dw_mmc_k3(E) bluetooth(E) dw_mmc_pltfm(E) dw_mmc(E)
> kirin_drm(E) rfkill(E) kirin_dsi(E) i2c_designware_core(E) k3dma(E)
> drm_dma_helper(E) cma_heap(E) system_heap(E)
> [ 9.480688][ T121] CPU: 4 PID: 121 Comm: ueventd Tainted: G
> OE 6.2.0-rc6-mainline-14196-g1d9f94ec75b9 #1
> [ 9.480694][ T121] Hardware name: HiKey960 (DT)
> [ 9.480697][ T121] pstate: 20400005 (nzCv daif +PAN -UAO -TCO
> -DIT -SSBS BTYPE=--)
> [ 9.480703][ T121] pc : hi3660_thermal_probe+0x6c/0x74 [hisi_thermal]
> [ 9.480722][ T121] lr : hi3660_thermal_probe+0x38/0x74 [hisi_thermal]
> [ 9.480733][ T121] sp : ffffffc00aa13700
> [ 9.480735][ T121] x29: ffffffc00aa13700 x28: 0000007ff8ae8531
> x27: 00000000000008c0
> [ 9.480743][ T121] x26: ffffffc00aa2a300 x25: ffffffc00aa2ab40
> x24: 000000000000001d
> [ 9.480749][ T121] x23: ffffffc00a29d000 x22: 0000000000000000
> x21: ffffff8001fa4a80
> [ 9.480755][ T121] x20: 0000000000000001 x19: ffffff8001fa4a80
> x18: ffffffc00a8810b0
> [ 9.480761][ T121] x17: 000000007ab542f2 x16: 000000007ab542f2
> x15: ffffffc00aa01000
> [ 9.480767][ T121] x14: ffffffc00966f250 x13: ffffffc0b58f9000
> x12: ffffffc00a055f10
> [ 9.480771][ T123] ueventd: LoadWithAliases was unable to load
> cpu:type:aarch64:feature:,0000,0001,0002,0003,0004,0005,0006,0007,000B
> [ 9.480773][ T121]
> [ 9.480774][ T121] x11: 0000000000000000 x10: 0000000000000001
> x9 : 0000000100000000
> [ 9.480780][ T123] ueventd:
> [ 9.480780][ T121] x8 : ffffffc0044154cb x7 : 0000000000000000
> x6 : 000000000000003f
> [ 9.480786][ T121] x5 : 0000000000000020 x4 : ffffffc0098db323
> x3 : ffffff801aeb62c0
> [ 9.480792][ T121] x2 : ffffff801aeb62c0 x1 : 0000000000000000
> x0 : ffffff8001fa4c80
> [ 9.480798][ T121] Call trace:
> [ 9.480801][ T121] hi3660_thermal_probe+0x6c/0x74 [hisi_thermal]
> [ 9.480813][ T121] hisi_thermal_probe+0xbc/0x284 [hisi_thermal]

Taking a look here, it looks pretty obvious:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/hisi_thermal.c#n414

    data->nr_sensors = 1;

    data->sensor = devm_kzalloc(dev, sizeof(*data->sensor) *
                                data->nr_sensors, GFP_KERNEL);

Here as nr_sensors=1, we allocate only one structure for the array.
But then below that, we modify two entries, writing past the valid
array, and corrupting data when writing the second sensor values.

    data->sensor[0].id = HI3660_BIG_SENSOR;
    data->sensor[0].irq_name = "tsensor_a73";
    data->sensor[0].data = data;

    data->sensor[1].id = HI3660_LITTLE_SENSOR;
    data->sensor[1].irq_name = "tsensor_a53";
    data->sensor[1].data = data;

I suspect nr_sensors needs to be set to 2.

Nice work, Kees!

thanks
-john
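
P.S. To make the mismatch concrete, here is a minimal user-space sketch of the pattern. It is not the driver code and not a proposed patch: the struct layouts are simplified stand-ins, calloc() stands in for devm_kzalloc(), the sensor id values are placeholders, and set_sensor() and model_probe() are hypothetical helpers I made up for illustration. The explicit index check plays the role of whatever bounds check fired in the trace above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholder ids for illustration; not taken from the driver. */
enum { HI3660_BIG_SENSOR = 1, HI3660_LITTLE_SENSOR = 2 };

struct thermal_data;

/* Simplified stand-in for the per-sensor structure. */
struct sensor {
	struct thermal_data *data;
	const char *irq_name;
	unsigned int id;
};

/* Simplified stand-in for the driver's private data. */
struct thermal_data {
	struct sensor *sensor;
	size_t nr_sensors;
};

/* Allocate exactly nr zeroed entries (calloc stands in for devm_kzalloc). */
static int alloc_sensors(struct thermal_data *data, size_t nr)
{
	data->sensor = calloc(nr, sizeof(*data->sensor));
	if (!data->sensor)
		return -1;
	data->nr_sensors = nr;
	return 0;
}

/* Fill one entry, refusing any index outside the allocated array. */
static int set_sensor(struct thermal_data *data, size_t idx,
		      unsigned int id, const char *irq_name)
{
	if (idx >= data->nr_sensors)
		return -1;	/* sensor[1] lands here when nr_sensors == 1 */
	data->sensor[idx].id = id;
	data->sensor[idx].irq_name = irq_name;
	data->sensor[idx].data = data;
	return 0;
}

/* Model of the probe path: two entries are written, so two must exist. */
static int model_probe(struct thermal_data *data, size_t nr_sensors)
{
	if (alloc_sensors(data, nr_sensors))
		return -1;
	if (set_sensor(data, 0, HI3660_BIG_SENSOR, "tsensor_a73") ||
	    set_sensor(data, 1, HI3660_LITTLE_SENSOR, "tsensor_a53")) {
		free(data->sensor);
		data->sensor = NULL;
		data->nr_sensors = 0;
		return -1;
	}
	return 0;
}
```

In this model, model_probe(&d, 1) fails at the second set_sensor() call, while model_probe(&d, 2) succeeds and leaves both entries populated, which is the behaviour I'd expect from the driver once nr_sensors matches the number of entries it initialises.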