RE: [PATCH rdma-next 1/3] IB/core: Fix kernel crash during fail to initialize device

I just saw your ack that it's merged.

Thanks,
Parav

> -----Original Message-----
> From: Parav Pandit
> Sent: Monday, April 24, 2017 1:04 PM
> To: 'Leon Romanovsky' <leon@xxxxxxxxxx>; Doug Ledford
> <dledford@xxxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx; # v4 . 2+ <stable@xxxxxxxxxxxxxxx>
> Subject: RE: [PATCH rdma-next 1/3] IB/core: Fix kernel crash during fail to
> initialize device
> 
> Hi Doug,
> 
> Did you get a chance to merge this fix, which avoids a kernel crash in the
> error scenario?
> 
> Parav
> 
> > -----Original Message-----
> > From: Leon Romanovsky [mailto:leon@xxxxxxxxxx]
> > Sent: Sunday, March 19, 2017 3:56 AM
> > To: Doug Ledford <dledford@xxxxxxxxxx>
> > Cc: linux-rdma@xxxxxxxxxxxxxxx; Parav Pandit <parav@xxxxxxxxxxxx>; #
> > v4 . 2+ <stable@xxxxxxxxxxxxxxx>
> > Subject: [PATCH rdma-next 1/3] IB/core: Fix kernel crash during fail
> > to initialize device
> >
> > From: Parav Pandit <parav@xxxxxxxxxxxx>
> >
> > This patch fixes the kernel crash that occurs when ib_dealloc_device()
> > is called because a provider driver fails with an error after
> > ib_alloc_device() and before it can register using ib_register_device().
> >
> > The crash below was seen in the lab. It can occur with any IB device
> > that fails its device initialization before invoking
> > ib_register_device().
> >
> > This patch avoids touching the cache and port immutable structures if
> > the device is not yet initialized.
> > It also releases the related memory when cache or port immutable data
> > structure initialization fails inside ib_register_device().
> >
> > [81416.561946] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [81416.570340] IP: ib_cache_release_one+0x29/0x80 [ib_core]
> > [81416.576222] PGD 78da66067
> > [81416.576223] PUD 7f2d7c067
> > [81416.579484] PMD 0
> > [81416.582720]
> > [81416.587242] Oops: 0000 [#1] SMP
> > [81416.722395] task: ffff8807887515c0 task.stack: ffffc900062c0000
> > [81416.729148] RIP: 0010:ib_cache_release_one+0x29/0x80 [ib_core]
> > [81416.735793] RSP: 0018:ffffc900062c3a90 EFLAGS: 00010202
> > [81416.741823] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> > [81416.749785] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff880859fec000
> > [81416.757757] RBP: ffffc900062c3aa0 R08: ffff8808536e5ac0 R09: ffff880859fec5b0
> > [81416.765708] R10: 00000000536e5c01 R11: ffff8808536e5ac0 R12: ffff880859fec000
> > [81416.773672] R13: 0000000000000000 R14: ffff8808536e5ac0 R15: ffff88084ebc0060
> > [81416.781621] FS:  00007fd879fab740(0000) GS:ffff88085fac0000(0000) knlGS:0000000000000000
> > [81416.790522] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [81416.797094] CR2: 0000000000000000 CR3: 00000007eb215000 CR4: 00000000003406e0
> > [81416.805051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [81416.812997] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [81416.820950] Call Trace:
> > [81416.824226]  ib_device_release+0x1e/0x40 [ib_core]
> > [81416.829858]  device_release+0x32/0xa0
> > [81416.834370]  kobject_cleanup+0x63/0x170
> > [81416.839058]  kobject_put+0x25/0x50
> > [81416.843319]  ib_dealloc_device+0x25/0x40 [ib_core]
> > [81416.848986]  mlx5_ib_add+0x163/0x1990 [mlx5_ib]
> > [81416.854414]  mlx5_add_device+0x5a/0x160 [mlx5_core]
> > [81416.860191]  mlx5_register_interface+0x8d/0xc0 [mlx5_core]
> > [81416.866587]  ? 0xffffffffa09e9000
> > [81416.870816]  mlx5_ib_init+0x15/0x17 [mlx5_ib]
> > [81416.876094]  do_one_initcall+0x51/0x1b0
> > [81416.880861]  ? __vunmap+0x85/0xd0
> > [81416.885113]  ? kmem_cache_alloc_trace+0x14b/0x1b0
> > [81416.890768]  ? vfree+0x2e/0x70
> > [81416.894762]  do_init_module+0x60/0x1fa
> > [81416.899441]  load_module+0x15f6/0x1af0
> > [81416.904114]  ? __symbol_put+0x60/0x60
> > [81416.908709]  ? ima_post_read_file+0x3d/0x80
> > [81416.913828]  ? security_kernel_post_read_file+0x6b/0x80
> > [81416.920006]  SYSC_finit_module+0xa6/0xf0
> > [81416.924888]  SyS_finit_module+0xe/0x10
> > [81416.929568]  entry_SYSCALL_64_fastpath+0x1a/0xa9
> > [81416.935089] RIP: 0033:0x7fd879494949
> > [81416.939543] RSP: 002b:00007ffdbc1b4e58 EFLAGS: 00000202 ORIG_RAX: 0000000000000139
> > [81416.947982] RAX: ffffffffffffffda RBX: 0000000001b66f00 RCX: 00007fd879494949
> > [81416.955965] RDX: 0000000000000000 RSI: 000000000041a13c RDI: 0000000000000003
> > [81416.963926] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000001b652a0
> > [81416.971861] R10: 0000000000000003 R11: 0000000000000202 R12: 00007ffdbc1b3e70
> > [81416.979763] R13: 00007ffdbc1b3e50 R14: 0000000000000005 R15: 0000000000000000
> > [81417.008005] RIP: ib_cache_release_one+0x29/0x80 [ib_core] RSP: ffffc900062c3a90
> > [81417.016045] CR2: 0000000000000000
> >
> > Fixes: 55aeed0654 ("IB/core: Make ib_alloc_device init the kobject")
> > Fixes: 7738613e7c ("IB/core: Add per port immutable struct to
> > ib_device")
> > Cc: <stable@xxxxxxxxxxxxxxx> # v4.2+
> > Reviewed-by: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> > Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leon@xxxxxxxxxx>
> > ---
> >  drivers/infiniband/core/device.c | 33 ++++++++++++++++++++++-----------
> >  1 file changed, 22 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index 593d2ce6ec7c..64a2ae4d8eaa 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -172,8 +172,16 @@ static void ib_device_release(struct device *device)
> >  {
> >  	struct ib_device *dev = container_of(device, struct ib_device, dev);
> >
> > -	ib_cache_release_one(dev);
> > -	kfree(dev->port_immutable);
> > +	WARN_ON(dev->reg_state == IB_DEV_REGISTERED);
> > +	if (dev->reg_state == IB_DEV_UNREGISTERED) {
> > +		/*
> > +		 * In IB_DEV_UNINITIALIZED state, cache or port table
> > +		 * is not even created. Free cache and port table only when
> > +		 * device reaches UNREGISTERED state.
> > +		 */
> > +		ib_cache_release_one(dev);
> > +		kfree(dev->port_immutable);
> > +	}
> >  	kfree(dev);
> >  }
> >
> > @@ -366,32 +374,27 @@ int ib_register_device(struct ib_device *device,
> >  	ret = ib_cache_setup_one(device);
> >  	if (ret) {
> >  		pr_warn("Couldn't set up InfiniBand P_Key/GID cache\n");
> > -		goto out;
> > +		goto port_cleanup;
> >  	}
> >
> >  	ret = ib_device_register_rdmacg(device);
> >  	if (ret) {
> >  		pr_warn("Couldn't register device with rdma cgroup\n");
> > -		ib_cache_cleanup_one(device);
> > -		goto out;
> > +		goto cache_cleanup;
> >  	}
> >
> >  	memset(&device->attrs, 0, sizeof(device->attrs));
> >  	ret = device->query_device(device, &device->attrs, &uhw);
> >  	if (ret) {
> >  		pr_warn("Couldn't query the device attributes\n");
> > -		ib_device_unregister_rdmacg(device);
> > -		ib_cache_cleanup_one(device);
> > -		goto out;
> > +		goto cache_cleanup;
> >  	}
> >
> >  	ret = ib_device_register_sysfs(device, port_callback);
> >  	if (ret) {
> >  		pr_warn("Couldn't register device %s with driver model\n",
> >  			device->name);
> > -		ib_device_unregister_rdmacg(device);
> > -		ib_cache_cleanup_one(device);
> > -		goto out;
> > +		goto cache_cleanup;
> >  	}
> >
> >  	device->reg_state = IB_DEV_REGISTERED;
> > @@ -403,6 +406,14 @@ int ib_register_device(struct ib_device *device,
> >  	down_write(&lists_rwsem);
> >  	list_add_tail(&device->core_list, &device_list);
> >  	up_write(&lists_rwsem);
> > +	mutex_unlock(&device_mutex);
> > +	return 0;
> > +
> > +cache_cleanup:
> > +	ib_cache_cleanup_one(device);
> > +	ib_cache_release_one(device);
> > +port_cleanup:
> > +	kfree(device->port_immutable);
> >  out:
> >  	mutex_unlock(&device_mutex);
> >  	return ret;
> > --
> > 2.12.0
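
For readers following the thread, the two ideas in the patch (a release callback guarded by registration state, and label-based unwinding in the register path) can be sketched in a small userspace model. This is only an illustrative sketch, not the kernel code: `toy_device`, `toy_register()`, `toy_release()`, the `fail_at` knob and the `live_allocs` counter are all hypothetical names invented here, and plain `malloc()`/`free()` stand in for the real cache and port immutable setup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's device registration states. */
enum toy_state { TOY_UNINITIALIZED, TOY_REGISTERED, TOY_UNREGISTERED };

struct toy_device {
	enum toy_state state;
	int *port_table;	/* models port_immutable */
	int *cache;		/* models the P_Key/GID cache */
};

static int live_allocs;		/* outstanding port_table/cache allocations */

static struct toy_device *toy_alloc(void)
{
	/* calloc() leaves state == TOY_UNINITIALIZED and both pointers NULL. */
	return calloc(1, sizeof(struct toy_device));
}

/*
 * fail_at (illustrative only): 0 = success, 1 = port table setup fails,
 * 2 = cache setup fails, 3 = a later step fails.
 */
static int toy_register(struct toy_device *dev, int fail_at)
{
	int ret = -1;

	if (fail_at == 1)
		goto out;
	dev->port_table = malloc(sizeof(int));
	if (!dev->port_table)
		goto out;
	live_allocs++;

	if (fail_at == 2)
		goto port_cleanup;
	dev->cache = malloc(sizeof(int));
	if (!dev->cache)
		goto port_cleanup;
	live_allocs++;

	if (fail_at == 3)
		goto cache_cleanup;

	dev->state = TOY_REGISTERED;
	return 0;

	/* Unwind in reverse order of setup; state stays TOY_UNINITIALIZED. */
cache_cleanup:
	free(dev->cache);
	dev->cache = NULL;
	live_allocs--;
port_cleanup:
	free(dev->port_table);
	dev->port_table = NULL;
	live_allocs--;
out:
	return ret;
}

static void toy_unregister(struct toy_device *dev)
{
	dev->state = TOY_UNREGISTERED;
}

static void toy_release(struct toy_device *dev)
{
	/* Releasing a still-registered device would be a caller bug. */
	assert(dev->state != TOY_REGISTERED);

	/* Only a device that was once registered owns these tables. */
	if (dev->state == TOY_UNREGISTERED) {
		free(dev->cache);
		free(dev->port_table);
		live_allocs -= 2;
	}
	free(dev);
}

/* Runs one alloc/register/release cycle; returns leaked table count. */
static int toy_lifecycle(int fail_at)
{
	struct toy_device *dev = toy_alloc();

	if (!dev)
		return -1;
	if (toy_register(dev, fail_at) == 0)
		toy_unregister(dev);
	toy_release(dev);
	return live_allocs;
}
```

Calling `toy_lifecycle()` with each `fail_at` value exercises every exit path. The point of the model is that a failed register leaves nothing behind for the release callback to free, so the release callback can safely skip the tables unless the device reached the unregistered state.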
