On Thu, Jan 25, 2018 at 02:00:26PM -0700, Jason Gunthorpe wrote: > On Thu, Jan 25, 2018 at 10:27:10PM +0200, Leon Romanovsky wrote: > > On Thu, Jan 25, 2018 at 01:10:23PM -0700, Jason Gunthorpe wrote: > > > On Thu, Jan 25, 2018 at 09:44:36PM +0200, Leon Romanovsky wrote: > > > > > > + /* > > > > > > + * @kref: Protect destroy of the resource > > > > > > + */ > > > > > > + struct kref kref; > > > > > > > > > > Sticking a kref in a random structure that is itself not malloc'd is a > > > > > big red flag: > > > > > > > > It is not "rand", but embedded part of ib_qp,ib_pd and ib_cq. It is > > > > malloced as part of their creation. > > > > > > I mean the kref isn't in any way linked the lifetime of the malloc > > > that contains it. So it isn't really a kref, it is just a refcount. > > > > For now, yes, in the future no. IMHO it is the direction to manage RDMA > > objects. > > Maybe but until we do that this doesn't have struct kref semantics at > all and shouldn't be called kref.. You are actually open-coded kref semantics with addition of completions. > > > > What I sent you wasn't remotely like this, it had two nested locks, > > > the outer mutex and a very special open coded inner read/write lock > > > that doesn't deadlock when nested.. > > > > I still didn't like your approach, because it has three words which i don't want > > to see here: "special open-coded variant". I believe that the way to go is to add > > completion structure and convert _del to be something like that: > > Well, we can code the same idea with a completion, it is a little > clearer but uses more memory for the per-object completion struct. > > Read side: > mutex_lock(list_lock) > for_each_list(...,obj...) { // Must NOT use _safe here > // The object is in process of being deleted > if (refcount_inc_not_zero(&obj->refcount)) > continue; > mutex_unlock(list_lock); > ... > mutex_lock(list_lock); > if (refcount_dec_and_test(&obj->recount)) > complete(&res->completion); > } > > Destroy side: > if (refcount_dec_and_test(&obj->ref)) > complete(&obj->completion); > wait_for_completion(obj->free); > > mutex_lock(list_lock) > list_del(obj); > > > The refcount starts at 1 during init. Destroy triggers the freeing > process by decr. Once the refcount reaches 0 it latches at 0 as 'to be > destroyed' and the pointer can never leave the list_lock critical > section. > > It is still tricky.. This is more or less how nldev does reads, with exception that counters (kref) incremented and decremented under r/w semaphore, it eliminates the check "if (refcount_inc_not_zero(&obj->refcount))" and instead complete it calls to release (kref semantics). It is worth to read the nldev part too. The bug is that I didn't block rdma_restrack_del(), but the semantics and locking scheme is right. Thanks > > Jason
Attachment:
signature.asc
Description: PGP signature