Re: [PATCH] RDMA/ucma: Put a lock around every call to the rdma_cm layer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 18, 2020 at 09:04:36PM +0000, Jason Gunthorpe wrote:
> The rdma_cm must be used single threaded.
> 
> This appears to be a bug in the design, as it does have lots of locking
> that seems like it should allow concurrency. However, when it is all said
> and done every single place that uses the cma_exch() scheme is broken, and
> all the unlocked reads from the ucma of the cm_id data are wrong too.
> 
> syzkaller has been finding endless bugs related to this.
> 
> Fixing this in any elegant way is some enormous amount of work. Take a
> very big hammer and put a mutex around everything to do with the
> ucma_context at the top of every syscall.
> 
> Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
> Reported-by: syzbot+adb15cf8c2798e4e0db4@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+e5579222b6a3edd96522@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+4b628fcc748474003457@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+29ee8f76017ce6cf03da@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+6956235342b7317ec564@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+b358909d8d01556b790b@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+6b46b135602a3f3ac99e@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+8458d13b13562abf6b77@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+bd034f3fdc0402e942ed@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+c92378b32760a4eef756@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+68b44a1597636e0b342c@xxxxxxxxxxxxxxxxxxxxxxxxx
> Cc: Eric Biggers <ebiggers@xxxxxxxxxx>
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/core/ucma.c | 50 ++++++++++++++++++++++++++++++++--
>  1 file changed, 48 insertions(+), 2 deletions(-)
> 
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-rc
> 
> Lets see if I told syzkaller about this properly..
> 
> EricB: If there are other rdma_cm related hits in syzkaller besides
> these 11 lets include them as  well. I wasn't able to find a way to
> search for things, this list is from your past email, thanks.
> 

Unfortunately I haven't had time to work on syzkaller bugs lately, so I can't
provide an updated list until I go through the long backlog of bugs.

A comment on the patch below:

> @@ -1112,13 +1134,17 @@ static ssize_t ucma_accept(struct ucma_file *file, const char __user *inbuf,
>  	if (cmd.conn_param.valid) {
>  		ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
>  		mutex_lock(&file->mut);
> +		mutex_lock(&ctx->mutex);
>  		ret = __rdma_accept(ctx->cm_id, &conn_param, NULL);
> +		mutex_unlock(&ctx->mutex);
>  		if (!ret)
>  			ctx->uid = cmd.uid;
>  		mutex_unlock(&file->mut);

This is nesting the new ucma_context::mutex inside the existing ucma_file::mut.

> @@ -1403,6 +1443,7 @@ static ssize_t ucma_process_join(struct ucma_file *file,
>  	if (IS_ERR(ctx))
>  		return PTR_ERR(ctx);
>  
> +	mutex_lock(&ctx->mutex);
>  	mutex_lock(&file->mut);
>  	mc = ucma_alloc_multicast(ctx);
>  	if (!mc) {

... but this is doing the opposite.  So it can deadlock.

What's the intended order?

Also, are these two separate mutexes actually needed?  I.e., did you consider
using the existing ucma_file::mut, but it didn't work or it wasn't fine-grained
enough?  (It looks like one ucma_file can have multiple ucma_contexts.)

- Eric



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux