Re: [PATCH v4 for-next 05/12] IB/cm: Share listening CM IDs

On 20/05/2015 01:35, Jason Gunthorpe wrote:
> On Tue, May 19, 2015 at 12:35:45PM -0600, Jason Gunthorpe wrote:
>> On Sun, May 17, 2015 at 08:51:01AM +0300, Haggai Eran wrote:
>>> @@ -212,6 +212,8 @@ struct cm_id_private {
>>>  	spinlock_t lock;	/* Do not acquire inside cm.lock */
>>>  	struct completion comp;
>>>  	atomic_t refcount;
>>> +	/* Number of clients sharing this ib_cm_id. Only valid for listeners. */
>>> +	atomic_t sharecount;
>>
>> No need for this atomic, hold the lock
>>
>> The use of the atomic looks racy:
>>
>>> +	if (!atomic_dec_and_test(&cm_id_priv->sharecount)) {
>>> +		/* The id is still shared. */
>>> +		return;
>>> +	}
>>
>> Might race with this:
>>
>>> +		if (atomic_inc_return(&cm_id_priv->sharecount) == 1) {
>>> +			/* This ID is already being destroyed */
>>> +			atomic_dec(&cm_id_priv->sharecount);
>>> +			goto new_id;
>>> +		}
>>> +
>>
>> Resulting in use-after-free of cm_id_priv->sharecount
> 
> Actually, there is something else odd here.. I mentioned the above
> because there wasn't obvious ref'ing on the cm_id_priv. Looking closer
> the cm.lock should prevent use-after-free, but there is still no ref.
> 
> The more I look at this, the more I think it is sketchy. Don't try and
> merge sharecount and refcount together, 
I'm not sure what you mean here. The way I was thinking about it was
that sharecount = num of rdma_cm_ids sharing this listener, while
refcount = num of active internal uses of this ID (work items, timers,
etc.) Do you want refcount to also include the sharecount?

> after cm_find_listen is called
> you have to increment the refcount before dropping cm.lock.
> 
> Decrement the refcount when destroying a shared listen.
You mean to decrement even if listen_sharecount > 0, and the id isn't
destroyed, right? The code already calls cm_deref_id when
listen_sharecount = 0 of course.

> I also don't see how the 'goto new_id' can work, if cm_find_listen
> succeeds then __ib_cm_listen is guaranteed to fail.
> 
> Fix the locking to make that impossible, associate sharecount with the
> cm.lock, and rework how cm_destroy_id grabs the cm_id_priv->lock spinlock:
> 
> 	case IB_CM_LISTEN:
> 		spin_lock_irq(&cm.lock);
> 		if (cm_id_priv->sharecount != 0) {
> 		     cm_id_priv->sharecount--;
> 		     // paired with the increment in ib_cm_id_create_and_listen
> 		     atomic_dec(&cm_id_priv->refcount);
> 		     spin_unlock_irq(&cm.lock);
> 		     return;
> 		}
> 		rb_erase(&cm_id_priv->service_node, &cm.listen_service_table);
> 		spin_unlock_irq(&cm.lock);
> 	
> 		spin_lock_irq(&cm_id_priv->lock);
> 		cm_id->state = IB_CM_IDLE;
> 		spin_unlock_irq(&cm_id_priv->lock);
> 		break;
> 
> Now that condition is eliminated, the unneeded atomic is gone, and
> refcount still acts like a proper kref should.
Thanks, that looks like a better solution.
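
For reference, a minimal userspace sketch of the scheme above, not the
actual kernel patch. It assumes sharecount counts the sharers beyond the
first listener (as in the snippet above) and that both the lookup and the
increment happen under one lock. struct listener, table_lock,
listener_find_and_share and listener_put are hypothetical names standing in
for cm_id_private, cm.lock, the cm_find_listen path and the IB_CM_LISTEN
branch of cm_destroy_id; refcount is a plain int here only because every
access in this sketch happens under the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for cm_id_private; not the kernel structure. */
struct listener {
	unsigned long service_id;
	bool in_table;		/* models membership in cm.listen_service_table */
	int sharecount;		/* sharers beyond the first; protected by table_lock */
	int refcount;		/* models the kref-like refcount */
};

/* Models cm.lock with a simple C11 spinlock. */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;

static void table_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&table_lock,
						 memory_order_acquire))
		; /* spin */
}

static void table_lock_release(void)
{
	atomic_flag_clear_explicit(&table_lock, memory_order_release);
}

/*
 * Look up a listener and take a share plus a reference before the lock
 * is dropped, so a concurrent destroy cannot free it in between.
 */
static struct listener *listener_find_and_share(struct listener *table,
						size_t n,
						unsigned long service_id)
{
	struct listener *found = NULL;

	table_lock_acquire();
	for (size_t i = 0; i < n; i++) {
		if (table[i].in_table && table[i].service_id == service_id) {
			found = &table[i];
			found->sharecount++;
			found->refcount++;
			break;
		}
	}
	table_lock_release();
	return found;
}

/*
 * Drop one user. Returns true only for the last user, who has removed
 * the listener from the table and must continue with the real teardown.
 */
static bool listener_put(struct listener *l)
{
	bool last;

	table_lock_acquire();
	assert(l->in_table);
	if (l->sharecount != 0) {
		l->sharecount--;
		l->refcount--;	/* paired with the increment in find_and_share */
		last = false;
	} else {
		l->in_table = false;	/* models rb_erase() */
		last = true;
	}
	table_lock_release();
	return last;
}
```

Because the removal from the table and the sharecount check happen under
the same lock as the lookup, a lookup can no longer return an ID that is
already on its way to being destroyed, so the retry path is not needed.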

Haggai