Re: [PATCH v2 0/4] Fix device_lock deadlock on two probe() paths

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jason,

On 09.08.2023 16:43, Jason Gunthorpe wrote:
> I missed two paths where __iommu_probe_device() can be called while
> already holding the device_lock() for the device that is to be probed.
>
> This causes a deadlock because __iommu_probe_device() will attempt to
> re-acquire the lock.
>
> Organize things so that these two paths can re-use the existing already
> held device_lock by marking the call chains based on if the lock is held
> or not.
>
> Also the order of iommu_init_device() was not correct for
> generic_single_device_group()

I've just noticed that there is still one more issue left to fix after 
applying this patchset. On Qualcomm's RB5 board I get the following 
warning (captured on vanilla next-20230817):

------------[ cut here ]------------
WARNING: CPU: 6 PID: 274 at include/linux/device.h:1012 
iommu_probe_device_locked+0xd4/0xe4
Modules linked in: apr pdr_interface venus_dec venus_enc 
videobuf2_dma_contig rpmsg_ctrl fastrpc qrtr_smd rpmsg_char 
lontium_lt9611uxc msm mcp251xfd qcom_camss can_dev videobuf2_dma_sg 
ocmem imx412 gpu_sched venus_core snd_soc_sm8250 phy_qcom_qmp_combo 
videobuf2_memops v4l2_fwnode leds_qcom_lpg v4l2_mem2mem drm_dp_aux_bus 
v4l2_async snd_soc_qcom_sdw videobuf2_v4l2 ax88179_178a onboard_usb_hub 
videodev rtc_pm8xxx qcom_spmi_adc5 qcom_pon qcom_spmi_adc_tm5 
led_class_multicolor qcom_spmi_temp_alarm qcom_vadc_common crct10dif_ce 
i2c_qcom_geni snd_soc_qcom_common qrtr drm_display_helper 
videobuf2_common camcc_sm8250 mc typec i2c_qcom_cci spi_geni_qcom 
qcom_stats llcc_qcom phy_qcom_qmp_usb qcom_rng icc_bwmon qcom_q6v5_pas 
qcrypto phy_qcom_snps_femto_v2 qcom_pil_info sha256_generic 
soundwire_qcom coresight_stm coresight_tmc stm_core coresight_funnel 
coresight_replicator libsha256 qcom_q6v5 pinctrl_sm8250_lpass_lpi 
soundwire_bus snd_soc_lpass_va_macro snd_soc_lpass_macro_common 
pinctrl_lpass_lpi coresight authenc
  lpass_gfm_sm8250 qcom_sysmon slimbus snd_soc_lpass_wsa_macro libdes 
qcom_common qcom_glink_smem socinfo mdt_loader qmi_helpers 
phy_qcom_qmp_pcie icc_osm_l3 qcom_wdt display_connector ip_tables 
x_tables ipv6
CPU: 6 PID: 274 Comm: kworker/u16:8 Not tainted 6.5.0-rc6-next-20230817 
#13867
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : iommu_probe_device_locked+0xd4/0xe4
lr : iommu_probe_device_locked+0x78/0xe4
...
Call trace:
  iommu_probe_device_locked+0xd4/0xe4
  of_iommu_configure+0x10c/0x200
  of_dma_configure_id+0x104/0x3b8
  a6xx_gmu_init+0x4c/0xccc [msm]
  a6xx_gpu_init+0x3ac/0x770 [msm]
  adreno_bind+0x174/0x2ac [msm]
  component_bind_all+0x118/0x24c
  msm_drm_bind+0x1e8/0x6c4 [msm]
  try_to_bring_up_aggregate_device+0x168/0x1d4
  __component_add+0xa8/0x170
  component_add+0x14/0x20
  dsi_dev_attach+0x20/0x2c [msm]
  dsi_host_attach+0x9c/0x144 [msm]
  devm_mipi_dsi_attach+0x34/0xb4
  lt9611uxc_attach_dsi.isra.0+0x84/0xfc [lontium_lt9611uxc]
  lt9611uxc_probe+0x5c8/0x68c [lontium_lt9611uxc]
  i2c_device_probe+0x14c/0x290
  really_probe+0x148/0x2b4
  __driver_probe_device+0x78/0x12c
  driver_probe_device+0x3c/0x160
  __device_attach_driver+0xb8/0x138
  bus_for_each_drv+0x84/0xe0
  __device_attach+0xa8/0x1b0
  device_initial_probe+0x14/0x20
  bus_probe_device+0xb0/0xb4
  deferred_probe_work_func+0x8c/0xc8
  process_one_work+0x1ec/0x53c
  worker_thread+0x298/0x408
  kthread+0x124/0x128
  ret_from_fork+0x10/0x20
irq event stamp: 207712
hardirqs last  enabled at (207711): [<ffffcfe16b46b160>] 
_raw_spin_unlock_irqrestore+0x74/0x78
hardirqs last disabled at (207712): [<ffffcfe16b458630>] el1_dbg+0x24/0x8c
softirqs last  enabled at (206480): [<ffffcfe16a290a10>] 
__do_softirq+0x438/0x4ec
softirqs last disabled at (206473): [<ffffcfe16a296980>] 
____do_softirq+0x10/0x1c
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------

Let me know if you need more information or some tests on the hardware.


>
> v2:
>   - New patch to correct the order of setting iommu_dev during
>     iommu_init_device()
>   - Spelling fixes
>   - Simply block probing the iommu device itself instead of trying to do it
> v1: https://lore.kernel.org/r/0-v1-8612b9ef48da+333-iommu_group_locking2_jgg@xxxxxxxxxx
>
> Jason Gunthorpe (4):
>    iommu: Provide iommu_probe_device_locked()
>    iommu: Pass in the iommu_device to probe for in bus_iommu_probe()
>    iommu: Do not attempt to re-lock the iommu device when probing
>    iommu: dev->iommu->iommu_dev must be set before ops->device_group()
>
>   drivers/acpi/scan.c        |  5 +--
>   drivers/iommu/iommu.c      | 65 +++++++++++++++++++++++++++-----------
>   drivers/iommu/of_iommu.c   |  2 +-
>   drivers/iommu/omap-iommu.c | 12 +++++--
>   include/linux/iommu.h      |  6 +++-
>   5 files changed, 65 insertions(+), 25 deletions(-)
>
>
> base-commit: 8d3740021d5d461e1ec4c17fc5625b9b4237c2f8

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux