Re: [PATCH] IB/ucm: ib_ucm_release_dev clears wrong bit if devnum is greater than IB_UCM_MAX_DEVICES

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 6/12/2015 3:08 PM, Doug Ledford wrote:
On 06/11/2015 12:06 PM, clsoto@xxxxxxxxxxxxxxxxxx wrote:
From: Carol L Soto <clsoto@xxxxxxxxxxxxxxxxxx>

ib_ucm_release_dev clears wrong bit if devnum is greater than IB_UCM_MAX_DEVICES.

Signed-off-by: Carol L Soto <clsoto@xxxxxxxxxxxxxxxxxx>
---
  drivers/infiniband/core/ucm.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index f2f6393..e2fd085 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -1193,6 +1193,7 @@ static int ib_ucm_close(struct inode *inode, struct file *filp)
  	return 0;
  }
+static DECLARE_BITMAP(overflow_map, IB_UCM_MAX_DEVICES);
  static void ib_ucm_release_dev(struct device *dev)
  {
  	struct ib_ucm_device *ucm_dev;
@@ -1202,7 +1203,7 @@ static void ib_ucm_release_dev(struct device *dev)
  	if (ucm_dev->devnum < IB_UCM_MAX_DEVICES)
  		clear_bit(ucm_dev->devnum, dev_map);
  	else
-		clear_bit(ucm_dev->devnum - IB_UCM_MAX_DEVICES, dev_map);
+		clear_bit(ucm_dev->devnum - IB_UCM_MAX_DEVICES, overflow_map);
  	kfree(ucm_dev);
  }
@@ -1226,7 +1227,6 @@ static ssize_t show_ibdev(struct device *dev, struct device_attribute *attr,
  static DEVICE_ATTR(ibdev, S_IRUGO, show_ibdev, NULL);
static dev_t overflow_maj;
-static DECLARE_BITMAP(overflow_map, IB_UCM_MAX_DEVICES);
  static int find_overflow_devnum(void)
  {
  	int ret;

This doesn't look right to me.  In particular, you are creating a bitmap
and clearing bits, but I never see that bitmap get any bits set.  So,
are you overflowing on the set routine and you just didn't bother to fix
that?
I just moved the declaration of the bitmap before ib_ucm_release_dev to be able to compile. This bitmap is used in ib_ucm_add_one in the case of devnum > IB_UCM_MAX_DEVICES. The error path in ib_ucm_add_one did the clear bit correctly but not the ib_ucm_release_dev function. We saw an issue when using SRIOV. The case was that user has 2 Mellanox SRIOV cards and just did an unbind/bind a VF that was using a ib ucm device with devnum greater than 32. We saw this warning
WARNING: at fs/sysfs/dir.c:31
May 31 14:04:11 powerkvm1-lp1 kernel: NIP [c000000000368958] sysfs_warn_dup+0x88/0xc0 May 31 14:04:11 powerkvm1-lp1 kernel: LR [c000000000368954] sysfs_warn_dup+0x84/0xc0
May 31 14:04:11 powerkvm1-lp1 kernel: Call Trace:
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf510] [c000000000368954] sysfs_warn_dup+0x84/0xc0 (unreliable) May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf590] [c000000000368fcc] sysfs_do_create_link_sd.isra.2+0x15c/0x170 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf5e0] [c0000000005aebb8] device_add+0x2f8/0x6f0 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf690] [d00000001f5c2a48] ib_ucm_add_one+0x288/0x378 [ib_ucm] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf730] [d00000001cef7468] ib_register_device+0x508/0x5b0 [ib_core] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf830] [d000000019baba88] mlx4_ib_add+0x918/0x1320 [mlx4_ib] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf950] [d000000019b32d10] mlx4_add_device+0x70/0x120 [mlx4_core] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf990] [d000000019b3304c] mlx4_register_device+0x9c/0xf0 [mlx4_core] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf9d0] [d000000019b38844] mlx4_load_one+0x1254/0x1600 [mlx4_core] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfad0] [d000000019b39210] mlx4_init_one+0x620/0x770 [mlx4_core] May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfba0] [c0000000004fdf6c] local_pci_probe+0x6c/0x130 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfc30] [c0000000000e5548] work_for_cpu_fn+0x38/0x60 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfc60] [c0000000000ea518] process_one_work+0x198/0x470 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfcf0] [c0000000000eaf0c] worker_thread+0x37c/0x5a0 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfd80] [c0000000000f1f90] kthread+0x110/0x130 May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfe30] [c000000000009660] ret_from_kernel_thread+0x5c/0x7c

the warning was due to it was trying to create ib ucm device with devnum 0 but that was in use already by another device. With the patch
I created, then I do not see the warning.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux