Re: [PATCH RFC] i2c: Fix deadlock on adapter removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Heiner,

Cc Luca and Thomas (they are interested in the issue).

On Tue, 4 Feb 2025 20:54:57 +0100
Heiner Kallweit <hkallweit1@xxxxxxxxx> wrote:

> i2c_del_adapter() can be called recursively if it has an i2c mux on
> the bus. This results in a deadlock.
> We use the lock to protect from parallel unregistering of clients in
> case driver and adapter are removed at the same time.
> The fix approach is based on the fact that the used iterators are
> klist-based. So it's safe to remove list elements during the iteration,
> and we don't have to acquire the core lock.
> As a result we just have to prevent that i2c_unregister_device() is
> executed in parallel for the same client. Use an atomic bit op for this
> purpose.
> 
> Fixes: 56a50667cbcf ("i2c: Replace list-based mechanism for handling auto-detected clients")
> Reported-by: Herve Codina <herve.codina@xxxxxxxxxxx>
> Signed-off-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>

I tested the patch.

The deadlock is no more present but I have this new issue using the exact
same command as the one I used to detect the deadlock:

    # echo 30a40000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind
    [   35.221250] Unable to handle ker
    ** replaying previous printk message **
    [   35.221250] Unable to handle kernel paging request at virtual address 70ffff8000818b85
    [   35.221298] Mem abort info:
    [   35.221330]   ESR = 0x0000000096000005
    [   35.221347]   EC = 0x25: DABT (current EL), IL = 32 bits
    [   35.221364]   SET = 0, FnV = 0
    [   35.221381]   EA = 0, S1PTW = 0
    [   35.221398]   FSC = 0x05: level 1 translation fault
    [   35.221415] Data abort info:
    [   35.221428]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
    [   35.221445]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [   35.221464]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [   35.221484] [70ffff8000818b85] address between user and kernel address ranges
    [   35.221806] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
    [   35.291401] Modules linked in: fsl_ldb imx8mp_interconnect imx_interconnect leds_pca963x imx_cpufreq_dt imx8mm_thermal lm75 tmp103 rtc_snvs     snvs_pwrkey rtc_rs5c372 pwm_imx27 st_pressure_spi st_sensors_spi regmap_spi st_pressure_i2c st_pressure st_sensors_i2c industrialio_triggered_buffer kfifo_buf st_sensors iio_hwmon gpio_charger led_bl panel_simple opt3001 governor_userspace imx_bus imx8mp_hdmi_tx dw_hdmi dwmac_imx stmmac_platform stmmac pcs_xpcs phylink samsung_dsim imx_sdma imx_lcdif drm_dma_helper imx8mp_hdmi_pvi fsl_imx8_ddr_perf caam exc3000 error pwm_bl ti_sn65dsi83 hotplug_bridge drm_display_helper drm_kms_helper drm drm_panel_orientation_quirks backlight gehc_sunh_connector
    [   35.352122] CPU: 1 UID: 0 PID: 294 Comm: sh Not tainted 6.14.0-rc1+ #24
    [   35.358743] Hardware name: HCO board (DT)
    [   35.365273] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [   35.372239] pc : sysfs_remove_groups+0x20/0x70
    [   35.376695] lr : device_remove_attrs+0xb8/0x100
    [   35.381237] sp : ffff8000834db9a0
    [   35.384551] x29: ffff8000834db9a0 x28: ffff000000e58000 x27: 0000000000000000
    [   35.391700] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
    [   35.398850] x23: ffff00007bc611b0 x22: ffff800080c4fb01 x21: 0000000000000000
    [   35.405997] x20: 70ffff8000818b85 x19: ffff00007bc72028 x18: 0000000000000000
    [   35.413144] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000028
    [   35.420290] x14: 0000000000000000 x13: 0000000000008d0d x12: ffff000000e58918
    [   35.427437] x11: ffff8000820618a8 x10: 0000000000000000 x9 : ffff800080733d08
    [   35.434585] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : 647165746f623200
    [   35.441733] x5 : ffff000000e58000 x4 : 0000000029dd16fe x3 : 0000000000000000
    [   35.448881] x2 : 0000000000000000 x1 : 70ffff8000818b85 x0 : ffff00007bc72028
    [   35.456030] Call trace:
    [   35.458478]  sysfs_remove_groups+0x20/0x70 (P)
    [   35.462932]  device_remove_attrs+0xb8/0x100
    [   35.467123]  device_del+0x144/0x410
    [   35.470616]  device_unregister+0x20/0x48
    [   35.474544]  i2c_unregister_device.part.0+0x64/0xc0
    [   35.479433]  __unregister_client+0x74/0x80
    [   35.483539]  device_for_each_child+0x68/0xc8
    [   35.487818]  i2c_del_adapter+0xac/0x198
    [   35.491663]  i2c_imx_remove+0x4c/0x190
    [   35.495419]  platform_remove+0x30/0x58
    [   35.499179]  device_remove+0x54/0x90
    [   35.502762]  device_release_driver_internal+0x1e8/0x250
    [   35.507994]  device_driver_detach+0x20/0x38
    [   35.512185]  unbind_store+0xbc/0xc8
    [   35.515680]  drv_attr_store+0x2c/0x48
    [   35.519349]  sysfs_kf_write+0x54/0x88
    [   35.523018]  kernfs_fop_write_iter+0x128/0x1c8
    [   35.527468]  vfs_write+0x290/0x398
    [   35.530876]  ksys_write+0x70/0x110
    [   35.534284]  __arm64_sys_write+0x24/0x38
    [   35.538212]  invoke_syscall+0x50/0x120
    [   35.541972]  el0_svc_common.constprop.0+0x48/0xf8
    [   35.546686]  do_el0_svc+0x28/0x40
    [   35.550009]  el0_svc+0x48/0x110
    [   35.553159]  el0t_64_sync_handler+0x144/0x168
    [   35.557522]  el0t_64_sync+0x198/0x1a0
    [   35.561198] Code: a9bd7bfd 910003fd a90153f3 aa0103f4 (f9400021) 
    [   35.567295] ---[ end trace 0000000000000000 ]---

I really don't know if this new issue is related to 56a50667cbcf ("i2c:
Replace list-based mechanism for handling auto-detected clients") or some
other patches but for sure it was not present in v6.13.

Can you have a look?

Feel free to ask for more information or tests I can do to help if needed.

Best regards,
Hervé




[Index of Archives]     [Linux GPIO]     [Linux SPI]     [Linux Hardward Monitoring]     [LM Sensors]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux