Re: [PATCH rdma-next v6 3/8] RDMA/restrack: Add general infrastructure to track RDMA resources

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 25, 2018 at 02:00:26PM -0700, Jason Gunthorpe wrote:
> On Thu, Jan 25, 2018 at 10:27:10PM +0200, Leon Romanovsky wrote:
> > On Thu, Jan 25, 2018 at 01:10:23PM -0700, Jason Gunthorpe wrote:
> > > On Thu, Jan 25, 2018 at 09:44:36PM +0200, Leon Romanovsky wrote:
> > > > > > +	/*
> > > > > > +	 * @kref: Protect destroy of the resource
> > > > > > +	 */
> > > > > > +	struct kref		kref;
> > > > >
> > > > > Sticking a kref in a random structure that is itself not malloc'd is a
> > > > > big red flag:
> > > >
> > > > It is not "rand", but embedded part of ib_qp,ib_pd and ib_cq. It is
> > > > malloced as part of their creation.
> > >
> > > I mean the kref isn't in any way linked the lifetime of the malloc
> > > that contains it. So it isn't really a kref, it is just a refcount.
> >
> > For now, yes, in the future no. IMHO it is the direction to manage RDMA
> > objects.
>
> Maybe but until we do that this doesn't have struct kref semantics at
> all and shouldn't be called kref..

You are actually open-coded kref semantics with addition of completions.

>
> > > What I sent you wasn't remotely like this, it had two nested locks,
> > > the outer mutex and a very special open coded inner read/write lock
> > > that doesn't deadlock when nested..
> >
> > I still didn't like your approach, because it has three words which i don't want
> > to see here: "special open-coded variant". I believe that the way to go is to add
> > completion structure and convert _del to be something like that:
>
> Well, we can code the same idea with a completion, it is a little
> clearer but uses more memory for the per-object completion struct.
>
> Read side:
>   mutex_lock(list_lock)
>   for_each_list(...,obj...) { // Must NOT use _safe here
>      // The object is in process of being deleted
>      if (refcount_inc_not_zero(&obj->refcount))
>           continue;
>      mutex_unlock(list_lock);
>       ...
>      mutex_lock(list_lock);
>      if (refcount_dec_and_test(&obj->recount))
>          complete(&res->completion);
>   }
>
> Destroy side:
>    if (refcount_dec_and_test(&obj->ref))
>          complete(&obj->completion);
>    wait_for_completion(obj->free);
>
>    mutex_lock(list_lock)
>    list_del(obj);
>
>
> The refcount starts at 1 during init. Destroy triggers the freeing
> process by decr. Once the refcount reaches 0 it latches at 0 as 'to be
> destroyed' and the pointer can never leave the list_lock critical
> section.
>
> It is still tricky..

This is more or less how nldev does reads, with exception that counters
(kref) incremented and decremented under r/w semaphore, it eliminates the
check "if (refcount_inc_not_zero(&obj->refcount))" and instead complete
it calls to release (kref semantics).

It is worth to read the nldev part too.

The bug is that I didn't block rdma_restrack_del(), but the semantics
and locking scheme is right.

Thanks

>
> Jason

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux