Re: [PATCH for-rc] IB/hfi1: Properly allocate rdma counter desc memory

On Mon, Nov 15, 2021 at 03:09:13PM -0500, Dennis Dalessandro wrote:
> When optional counter support was added, the memory allocated to hold the
> counter descriptors was not zeroed. This caused massive WARN_ON()s in
> IB/sysfs code to be hit. There is an assumption made that optional counters must
> not come before required counters. This is determined by the flags field, which
> was left uninitialized.
> 
> The result is that the console is flooded with WARN_ON output for over 3 minutes
> on driver load. We can fix this by simply using kzalloc instead of kmalloc. While
> here, change the sizeof() calls to use the pointer rather than the name of the
> type.
> 
> [77952.529518] ------------[ cut here ]------------
> [77952.535428] WARNING: CPU: 0 PID: 32644 at
> drivers/infiniband/core/sysfs.c:1064 ib_setup_port_attrs+0x7e1/0x890 [ib_core]
> [77952.548374] Modules linked in: hfi1(+) rdmavt ib_ipoib ib_isert ib_iser
> ib_umad rdma_ucm ib_uverbs rpcrdma ib_srpt ib_srp rdma_cm iw_cm ib_cm ib_core
> nfsd nfs_acl scsi_transport_srp rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
> nfs lockd grace fscache netfs rfkill sunrpc iscsi_target_mod target_core_mod
> libiscsi scsi_transport_iscsi vfat fat iTCO_wdt iTCO_vendor_support mxm_wmi
> sb_edac x86_pkg_temp_thermal intel_powerclamp mgag200 coretemp crct10dif_pclmul
> drm_kms_helper crc32_pclmul syscopyarea ghash_clmulni_intel sysfillrect ipmi_si
> sysimgblt fb_sys_fops aesni_intel mei_me i2c_i801 ipmi_devintf crypto_simd
> i2c_algo_bit drm i2c_smbus lpc_ich cryptd pcspkr ipmi_msghandler mfd_core mei
> i2c_core ioatdma wmi acpi_power_meter acpi_pad sch_fq_codel ip_tables xfs
> libcrc32c sd_mod t10_pi sg ixgbe ahci mdio libahci ptp crc32c_intel pps_core
> libata dca [last unloaded: ib_core]
> [77952.640387] CPU: 0 PID: 32644 Comm: kworker/0:2 Tainted: G S      W
> 5.15.0+ #36
> [77952.650229] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS
> SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
> [77952.663077] Workqueue: events work_for_cpu_fn
> [77952.668831] RIP: 0010:ib_setup_port_attrs+0x7e1/0x890 [ib_core]
> [77952.676337] Code: 48 83 7b 70 00 0f 84 e4 f9 ff ff e9 17 fe ff ff 31 c0 e9 4b
> fb ff ff 48 89 ef 89 04 24 e8 67 d0 a8 e0 8b 04 24 e9 1a fb ff ff <0f> 0b 49 8b
> 10 e9 de fe ff ff ba 34 00 00 00 be c0 0d 00 00 44 89
> [77952.699056] RSP: 0018:ffffc90006ea3c40 EFLAGS: 00010202
> [77952.705749] RAX: 0000000000000068 RBX: ffff888106ad8000 RCX: 0000000000000138
> [77952.714567] RDX: ffff888126c84c00 RSI: ffff888103c41000 RDI: 0000000000000124
> [77952.723370] RBP: ffff88810f63a801 R08: ffff888126c8a000 R09: 0000000000000001
> [77952.732156] R10: ffffffffa09acf20 R11: 0000000000000065 R12: ffff88810f63a800
> [77952.740943] R13: ffff88810f63a800 R14: ffff88810f63a8e0 R15: 0000000000000001
> [77952.749717] FS:  0000000000000000(0000) GS:ffff888667a00000(0000)
> knlGS:0000000000000000
> [77952.759556] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [77952.766765] CR2: 00005590102cb078 CR3: 000000000240a003 CR4: 00000000001706f0
> [77952.775527] Call Trace:
> [77952.779051]  ib_register_device.cold.44+0x23e/0x2d0 [ib_core]
> [77952.786298]  ? __vmalloc_node_range+0x1fb/0x320
> [77952.792158]  ? __vmalloc_node+0x44/0x70
> [77952.797234]  rvt_register_device+0xfa/0x230 [rdmavt]
> [77952.803568]  hfi1_register_ib_device+0x623/0x690 [hfi1]
> [77952.810238]  init_one.cold.36+0x2d1/0x49b [hfi1]
> [77952.816236]  local_pci_probe+0x45/0x80
> [77952.821189]  work_for_cpu_fn+0x16/0x20
> [77952.826132]  process_one_work+0x1b1/0x360
> [77952.831368]  worker_thread+0x1d4/0x3a0
> [77952.836310]  ? process_one_work+0x360/0x360
> [77952.841741]  kthread+0x11a/0x140
> [77952.846098]  ? set_kthread_struct+0x40/0x40
> [77952.851521]  ret_from_fork+0x22/0x30
> [77952.856257] ---[ end trace eadcb3e247decd87 ]---
> [77952.862174] ------------[ cut here ]------------
> 
> 
> Fixes: 5e2ddd1e5982 ("RDMA/counter: Add optional counter support")
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxxxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/hfi1/verbs.c |    5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
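
For readers following the thread, the bug class described in the quoted
message can be sketched in userspace C. This is only an illustration, not
the hfi1 patch itself: struct counter_desc, DESC_FLAG_OPTIONAL,
alloc_descs() and required_before_optional() are hypothetical names, and
calloc() stands in for the kernel's zeroing allocator. It shows a
zero-filled allocation sized with sizeof(*ptr), plus a simplified model of
the "optional counters must not come before required counters" assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a counter descriptor (not the kernel struct). */
struct counter_desc {
	const char *name;
	unsigned int flags;	/* 0 means "required" in this sketch */
};

#define DESC_FLAG_OPTIONAL 0x1u	/* hypothetical flag value */

/*
 * Allocate n descriptors, zero-filled. calloc() plays the role that
 * kzalloc()/kcalloc() play in the kernel: every flags field starts at 0.
 * sizeof(*descs) follows the pointer's type, so the size stays correct
 * even if the element type is changed later.
 */
struct counter_desc *alloc_descs(size_t n)
{
	struct counter_desc *descs;

	descs = calloc(n, sizeof(*descs));
	return descs;
}

/*
 * Simplified model of the ordering assumption: return 1 if no required
 * counter appears after an optional one, 0 otherwise. With malloc()
 * instead of calloc(), flags would hold indeterminate values and this
 * check could fail unpredictably.
 */
int required_before_optional(const struct counter_desc *descs, size_t n)
{
	int seen_optional = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (descs[i].flags & DESC_FLAG_OPTIONAL)
			seen_optional = 1;
		else if (seen_optional)
			return 0;
	}
	return 1;
}
```

With a freshly allocated array, required_before_optional() holds because
every flags field is zero; marking an early entry optional while later
entries stay required makes it fail, which is the condition the warnings
in the quoted log are described as reporting.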

Applied to for-rc, thanks

Jason


