Re: [PATCH rdma-next v1] RDMA/restrack: Don't rely on uninitialized variable in restrack_add flow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 14, 2018 at 11:10:45AM -0500, Steve Wise wrote:
>
>
> > On Wed, Mar 14, 2018 at 10:39:03AM -0500, Steve Wise wrote:
> > > > On Wed, Mar 14, 2018 at 09:20:39AM -0500, Steve Wise wrote:
> > > > > >
> > > > > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > > > > >
> > > > > > The restrack code relies on the fact that object structures are
> zeroed
> > > at
> > > > > > the allocation stage, the mlx4 CQ wasn't allocated with kzalloc
> and it
> > > > > > caused to the following crash.
> > > > > >
> > > > > > [  137.392209] general protection fault: 0000 [#1] SMP KASAN PTI
> > > > > > [  137.392972] CPU: 0 PID: 622 Comm: ibv_rc_pingpong Tainted: G
> > > W
> > > > > > 4.16.0-rc1-00099-g00313983cda6 #11
> > > > > > [  137.395079] Hardware name: QEMU Standard PC (i440FX + PIIX,
> 1996),
> > > > BIOS
> > > > > > 1.10.2-2.fc27 04/01/2014
> > > > > > [  137.396866] RIP: 0010:rdma_restrack_del+0xc8/0xf0
> > > > > > [  137.397762] RSP: 0018:ffff8801b54e7968 EFLAGS: 00010206
> > > > > > [  137.399008] RAX: 0000000000000000 RBX: ffff8801d8bcbae8 RCX:
> > > > > > ffffffffb82314df
> > > > > > [  137.400055] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI:
> > > > > > 70696b533d454741
> > > > > > [  137.401103] RBP: ffff8801d90c07a0 R08: ffff8801d8bcbb00 R09:
> > > > > > 0000000000000000
> > > > > > [  137.402470] R10: 0000000000000001 R11: ffffed0036a9cf52 R12:
> > > > > > ffff8801d90c0ad0
> > > > > > [  137.403318] R13: ffff8801d853fb20 R14: ffff8801d8bcbb28 R15:
> > > > > > 0000000000000014
> > > > > > [  137.404736] FS:  00007fb415d43740(0000)
> GS:ffff8801e5c00000(0000)
> > > > > > knlGS:0000000000000000
> > > > > > [  137.406074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > [  137.407101] CR2: 00007fb41557df20 CR3: 00000001b580c001 CR4:
> > > > > > 00000000003606b0
> > > > > > [  137.408308] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > > > > > 0000000000000000
> > > > > > [  137.409352] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > > > > > 0000000000000400
> > > > > > [  137.410385] Call Trace:
> > > > > > [  137.411058]  ib_destroy_cq+0x23/0x60
> > > > > > [  137.411460]  uverbs_free_cq+0x37/0xa0
> > > > > > [  137.412040]  remove_commit_idr_uobject+0x38/0xf0
> > > > > > [  137.413042]  _rdma_remove_commit_uobject+0x5c/0x160
> > > > > > [  137.413782]  ? lookup_get_idr_uobject+0x39/0x50
> > > > > > [  137.414737]  rdma_remove_commit_uobject+0x3b/0x70
> > > > > > [  137.415742]  ib_uverbs_destroy_cq+0x114/0x1d0
> > > > > > [  137.416260]  ? ib_uverbs_req_notify_cq+0x160/0x160
> > > > > > [  137.417073]  ? kernel_text_address+0x5c/0x90
> > > > > > [  137.417805]  ? __kernel_text_address+0xe/0x30
> > > > > > [  137.418766]  ? unwind_get_return_address+0x2f/0x50
> > > > > > [  137.419558]  ib_uverbs_write+0x453/0x6a0
> > > > > > [  137.420220]  ? show_ibdev+0x90/0x90
> > > > > > [  137.420653]  ? __kasan_slab_free+0x136/0x180
> > > > > > [  137.421155]  ? kmem_cache_free+0x78/0x1e0
> > > > > > [  137.422192]  ? remove_vma+0x83/0x90
> > > > > > [  137.422614]  ? do_munmap+0x447/0x6c0
> > > > > > [  137.423045]  ? vm_munmap+0xb0/0x100
> > > > > > [  137.423481]  ? SyS_munmap+0x1d/0x30
> > > > > > [  137.424120]  ? do_syscall_64+0xeb/0x250
> > > > > > [  137.424984]  ? entry_SYSCALL_64_after_hwframe+0x21/0x86
> > > > > > [  137.425611]  ? lru_add_drain_all+0x270/0x270
> > > > > > [  137.426116]  ? lru_add_drain_cpu+0xa3/0x170
> > > > > > [  137.426616]  ? lru_add_drain+0x11/0x20
> > > > > > [  137.427058]  ? free_pages_and_swap_cache+0xa6/0x120
> > > > > > [  137.427672]  ? tlb_flush_mmu_free+0x78/0x90
> > > > > > [  137.428168]  ? arch_tlb_finish_mmu+0x6d/0xb0
> > > > > > [  137.428680]  __vfs_write+0xc4/0x350
> > > > > > [  137.430917]  ? kernel_read+0xa0/0xa0
> > > > > > [  137.432758]  ? remove_vma+0x90/0x90
> > > > > > [  137.434781]  ? __kasan_slab_free+0x14b/0x180
> > > > > > [  137.437486]  ? remove_vma+0x83/0x90
> > > > > > [  137.439836]  ? kmem_cache_free+0x78/0x1e0
> > > > > > [  137.442195]  ? percpu_counter_add_batch+0x1d/0x90
> > > > > > [  137.444389]  vfs_write+0xf7/0x280
> > > > > > [  137.446030]  SyS_write+0xa1/0x120
> > > > > > [  137.447867]  ? SyS_read+0x120/0x120
> > > > > > [  137.449670]  ? mm_fault_error+0x180/0x180
> > > > > > [  137.451539]  ? _cond_resched+0x16/0x50
> > > > > > [  137.453697]  ? SyS_read+0x120/0x120
> > > > > > [  137.455883]  do_syscall_64+0xeb/0x250
> > > > > > [  137.457686]  entry_SYSCALL_64_after_hwframe+0x21/0x86
> > > > > > [  137.459595] RIP: 0033:0x7fb415637b94
> > > > > > [  137.461315] RSP: 002b:00007ffdebea7d88 EFLAGS: 00000246
> > ORIG_RAX:
> > > > > > 0000000000000001
> > > > > > [  137.463879] RAX: ffffffffffffffda RBX: 00005565022d1bd0 RCX:
> > > > > > 00007fb415637b94
> > > > > > [  137.466519] RDX: 0000000000000018 RSI: 00007ffdebea7da0 RDI:
> > > > > > 0000000000000003
> > > > > > [  137.469543] RBP: 00007ffdebea7d98 R08: 0000000000000000 R09:
> > > > > > 00005565022d40c0
> > > > > > [  137.472479] R10: 00000000000009cf R11: 0000000000000246 R12:
> > > > > > 00005565022d2520
> > > > > > [  137.475125] R13: 00000000000003e8 R14: 0000000000000000 R15:
> > > > > > 00007ffdebea7fd0
> > > > > > [  137.477760] Code: f7 e8 dd 0d 0b ff 48 c7 43 40 00 00 00 00 48
> 89
> > > df e8
> > > > > 0d 0b
> > > > > > 0b ff 48 8d 7b 28 c6 03 00 e8 41 0d 0b ff 48 8b 7b 28 48 85 ff 74
> 06
> > > <f0>
> > > > > ff 4f 48
> > > > > > 74 10 5b 48 89 ef 5d 41 5c 41 5d 41 5e e9 32 b0 ee
> > > > > > [  137.483375] RIP: rdma_restrack_del+0xc8/0xf0 RSP:
> ffff8801b54e7968
> > > > > > [  137.486436] ---[ end trace 81835a1ea6722eed ]---
> > > > > > [  137.488566] Kernel panic - not syncing: Fatal exception
> > > > > > [  137.491162] Kernel Offset: 0x36000000 from 0xffffffff81000000
> > > > > (relocation
> > > > > > range: 0xffffffff80000000-0xffffffffbfffffff)
> > > > > >
> > > > > > Fixes: 00313983cda6 ("RDMA/nldev: provide detailed CM_ID
> > information")
> > > > > > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > > > > > ---
> > > > > >  drivers/infiniband/core/restrack.c | 5 ++---
> > > > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/core/restrack.c
> > > > > > b/drivers/infiniband/core/restrack.c
> > > > > > index e1d9934d6e81..91bf5df11608 100644
> > > > > > --- a/drivers/infiniband/core/restrack.c
> > > > > > +++ b/drivers/infiniband/core/restrack.c
> > > > > > @@ -113,13 +113,12 @@ void rdma_restrack_add(struct
> > > > rdma_restrack_entry
> > > > > > *res)
> > > > > >  	if (!dev)
> > > > > >  		return;
> > > > > >
> > > > > > +	res->task = NULL;
> > > > > >  	if (res_is_user(res)) {
> > > > > > -		if (!res->task)
> > > > > > -			rdma_restrack_set_task(res, current);
> > > > > > +		rdma_restrack_set_task(res, current);
> > > > > >  		res->kern_name = NULL;
> > > > > >  	} else {
> > > > > >  		set_kern_name(res);
> > > > > > -		res->task = NULL;
> > > > > >  	}
> > > > > >
> > > > > >  	kref_init(&res->kref);
> > > > > > --
> > > > > > 2.14.3
> > > > >
> > > > > This breaks cm_id pid/kern_name...  You just undid the change I
> added.
> > > NAK.
> > > > > The problem is that at the time the CMA adds the restrack resource,
> it
> > > is
> > > > > running in a workqueue context for some cases and that pid is
> incorrect.
> > > >
> > > > Not exactly, I used the fact that later the RDMA-CM will overwrite
> ->task.
> > > >
> > > > Before calling rdma_restrack_add, the code had one of two states:
> > > > 1. res->task == NULL => no change in behaviour
> > > > 2. res->task != NULL => before it skipped, now it will connect.
> > > >
> > >
> > > Hmm...  Do you get the correct PIDs for user mode and kernel mode
> cm_ids?
> > > Both active and passive sides of the connections?  Just run rping for
> the
> > > user side, and nvmef or iser on the kernel side, and verify all the pids
> are
> > > correct and/or kern_name's are correct.   If that is the case, then
> > > NAK->ACK...
> > >
> > >
> > > > What about such change on top of this fix?
> > > >
> > > > diff --git a/drivers/infiniband/core/restrack.c
> > > > b/drivers/infiniband/core/restrack.c
> > > > index 91bf5df11608..2e6ae88e830b 100644
> > > > --- a/drivers/infiniband/core/restrack.c
> > > > +++ b/drivers/infiniband/core/restrack.c
> > > > @@ -115,7 +115,9 @@ void rdma_restrack_add(struct rdma_restrack_entry
> > > > *res)
> > > >
> > > >          res->task = NULL;
> > > >          if (res_is_user(res)) {
> > > >  -               rdma_restrack_set_task(res, current);
> > > >  +               /* it will be set later for RDMA-CM */
> > > >  +               if (res->type != RDMA_RESTRACK_CM_ID)
> > > >  +                       rdma_restrack_set_task(res,  current);
> > > > 	                 res->kern_name = NULL;
> > > > 	         } else {
> > > > 	                 set_kern_name(res);
> > > > 		 }
> > > >
> > > >
> > >
> > > I think you would not want to set res->track to NULL for CM_ID restracks
> as
> > > well.  But a solution that special cases CM_ID is ok with me.
> >
> > You do need to set it to zero, otherwise subsequent call to
> > rdma_restrack_set_task() in RDMA-CM (e.g. __rdma_accept) path
> > will be dependant on zeroed res object.
> >
>
> I think you end up leaving a task referenced by setting res->task = NULL, if
> CMA actually set a task before calling restrack_add().  Which it does, IIRC.
>
>
> Why don't you just add an rdma_restrack_entry_init() and call it everywhere
> that has a restrack_entry embedded in it, and be done with it?

I wanted to preserve simple add<->del API, but it looks like that
I'll do init() function at the end.

Thanks

>
> Steve.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux