On Wed, Feb 01, 2017 at 02:39:00PM +0200, Matan Barak wrote:

> +static int uverbs_lock_object(struct ib_uobject *uobj, bool write)
> +{
> +	if (!write)
> +		return down_read_trylock(&uobj->currently_used) == 1 ? 0 :
> +			-EBUSY;
> +
> +	/* lock is either WRITE or DESTROY - should be exclusive */
> +	return down_write_trylock(&uobj->currently_used) == 1 ? 0 : -EBUSY;

Is currently_used ever used as an actual blocking rwsem? Looks like no to
me right now; is that the long term plan?

It actually seems to be important that this not be blocking, so it might
be better to code this as a simple atomic. And let's call it
uverbs_try_lock_object() for clarity.

> +static void uverbs_uobject_put_ref(struct kref *ref)
> +{
> +	struct ib_uobject *uobj =
> +		container_of(ref, struct ib_uobject, ref);
> +
> +	uobj->type->ops->release(uobj);
> +}

This sort of approach starts to become very dangerous when you
contemplate having 'release' be a function in a module. Is that the
case?

But, broadly speaking, I expect to see kfree() in a release function,
and from what I can tell the only issue here is that the IDR sometimes
needs kfree_rcu. So let's drop the function pointer and just add a
1 bit flag:

	if (uobj->needs_rcu)
		kfree_rcu(uobj);
	else
		kfree(uobj);

This is more clear. Don't overdesign things at this point, because it
becomes hard to see down the road if things are done wrong (eg putting
release in a different module).

> +static void init_uobj(struct ib_uobject *uobj, struct ib_ucontext *context,
> +		      const struct uverbs_obj_type *type)
> +{
> +	/*
> +	 * user_handle should be filled by the handler,
> +	 * The object is added to the list in the commit stage.
> +	 */
> +	init_rwsem(&uobj->currently_used);
> +	uobj->context = context;
> +	uobj->type = type;

.. and can you please not add the bogus whitespace in new commits? That
is really not the typical kernel style and makes everything hard to
maintain and read.

> +	kref_init(&uobj->ref);
> +}

Let's call this alloc_uobj, and just use kzalloc(type->obj_size) to do
it. Due to the kref and the above comment about release it is not a
good idea to have a type specific allocation; we always want kalloc'd
memory here.

> +static void uverbs_idr_remove_uobj(struct ib_uobject *uobj)
> +{
> +	spin_lock(&uobj->context->ufile->idr_lock);
> +	idr_remove(&uobj->context->ufile->idr, uobj->id);
> +	spin_unlock(&uobj->context->ufile->idr_lock);
> +}

Add a clarifying comment:

	/* The caller must uverbs_uobject_put() uobj */

> +	rcu_read_lock();
> +	/* object won't be released as we're protected in rcu */
> +	uobj = idr_find(&ucontext->ufile->idr, id);
> +	if (!uobj) {
> +		uobj = ERR_PTR(-ENOENT);
> +		goto free;
> +	}
> +
> +	if (uobj->type != type) {
> +		uobj = ERR_PTR(-EINVAL);
> +		goto free;
> +	}
> +
> +	ret = uverbs_lock_object(uobj, write);

Hum. This RCU is a bit exciting..

So this relies on the write lock being held whenever
uverbs_idr_remove_uobj() is called. Can you add a LOCKDEP style of
assertion to uverbs_idr_remove_uobj() to prove that?

> +static struct ib_uobject *alloc_begin_idr_uobject(const struct uverbs_obj_type *type,
> +						   struct ib_ucontext *ucontext)
> +{
> +	int ret;
> +	const struct uverbs_obj_idr_type *idr_type =
> +		container_of(type, struct uverbs_obj_idr_type, type);
> +	struct ib_uobject *uobj = kmalloc(idr_type->obj_size, GFP_KERNEL);
> +
> +	if (!uobj)
> +		return ERR_PTR(-ENOMEM);

This would be the thing to move to alloc_uobj. Ditto for fd.
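Eg something roughly like this (untested sketch; 'usecnt' as the atomic
that replaces currently_used is just an illustrative name, and it assumes
obj_size moves into uverbs_obj_type as mentioned further down):

	static struct ib_uobject *alloc_uobj(struct ib_ucontext *context,
					     const struct uverbs_obj_type *type)
	{
		struct ib_uobject *uobj = kzalloc(type->obj_size, GFP_KERNEL);

		if (!uobj)
			return ERR_PTR(-ENOMEM);
		/*
		 * user_handle should be filled by the handler; the object is
		 * only added to the uobjects list in the commit stage.
		 */
		uobj->context = context;
		uobj->type = type;
		atomic_set(&uobj->usecnt, 0);	/* the try-lock counter */
		kref_init(&uobj->ref);

		return uobj;
	}

Then both the idr and fd alloc_begin flows are just alloc_uobj() plus
their own 'make it visible' step.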
> +	init_uobj(uobj, ucontext, type);
> +	ret = idr_add_uobj(uobj);
> +	if (ret) {
> +		kfree(uobj);

This should be a uverbs_uobject_put()

> +static void uverbs_uobject_add(struct ib_uobject *uobject)
> +{
> +	mutex_lock(&uobject->context->lock);
> +	list_add(&uobject->list, &uobject->context->uobjects);
> +	mutex_unlock(&uobject->context->lock);
> +}

I'd open code this into alloc_commit_idr_uobject. The only place the
list should be updated is directly after it has been placed in the IDR;
it is confusing to see it in a function as though it can be called from
someplace else.

> +static void _put_uobj_ref(struct kref *ref)
> +{
> +	kfree(container_of(ref, struct ib_uobject, ref));
> +}
> +
> +static void alloc_abort_idr_uobject(struct ib_uobject *uobj)
> +{
> +	uverbs_idr_remove_uobj(uobj);
> +	/*
> +	 * we don't need kfree_rcu here, as the uobject wasn't exposed to any
> +	 * other verb.
> +	 */
> +	kref_put(&uobj->ref, _put_uobj_ref);
> +}

Once needs_rcu is added then this ugly stuff goes away. Set needs_rcu
only when the uobject has been added to the IDR.
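With that, the whole put path collapses to something like this (sketch
only; it assumes ib_uobject grows a struct rcu_head named 'rcu' for
kfree_rcu):

	static void uverbs_uobject_free(struct kref *ref)
	{
		struct ib_uobject *uobj =
			container_of(ref, struct ib_uobject, ref);

		if (uobj->needs_rcu)
			kfree_rcu(uobj, rcu);
		else
			kfree(uobj);
	}

	void uverbs_uobject_put(struct ib_uobject *uobj)
	{
		kref_put(&uobj->ref, uverbs_uobject_free);
	}

so alloc_abort can just be uverbs_idr_remove_uobj() followed by
uverbs_uobject_put(), with no special kref helper.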
> +static void destroy_commit_idr_uobject(struct ib_uobject *uobj)

This is a confusing name.. See below..

> +{
> +	uverbs_idr_remove_uobj(uobj);
> +	mutex_lock(&uobj->context->lock);
> +	list_del(&uobj->list);
> +	mutex_unlock(&uobj->context->lock);
> +	uverbs_uobject_put(uobj);
> +}

And this flow is weird: hot_unplug calls an idr_type op, but this does
not? Why?

> +void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed)
> +{
> +	unsigned int cur_order = 0;
> +
> +	while (!list_empty(&ucontext->uobjects)) {
> +		struct ib_uobject *obj, *next_obj;
> +		unsigned int next_order = UINT_MAX;
> +
> +		/*
> +		 * This shouldn't run while executing other commands on this
> +		 * context, thus no lock is required.
> +		 */
> +		list_for_each_entry_safe(obj, next_obj, &ucontext->uobjects,
> +					 list)
> +			if (obj->type->destroy_order == cur_order) {
> +				list_del(&obj->list);
> +				obj->type->ops->hot_unplug(obj, device_removed);

Let's add a safety:

	WARN_ON(uverbs_lock_object(obj, true));

It would be really nice to see a simpler 'built-in' scheme to guarantee
that WARN_ON never hits. I expect this is relying on the external SRCU
system to make it work out.. For instance, since we now have
lookup_get/lookup_put, could we ultimately move the SRCU into those
functions?

Also, the call to hot_unplug has to be checked.

> +	/* locking the uobjects_list */
> +	struct mutex lock;

Prefer uobjects_lock for clarity.

> @@ -1374,8 +1378,11 @@ struct ib_uobject {
>  	int id;			/* index into kernel idr */
> +
> +struct uverbs_obj_type_ops {
> +	/*
> +	 * Get an ib_uobject that corresponds to the given id from ucontext,
> +	 * These functions could create or destroy objects if required.
> +	 * The action will be finalized only when commit, abort or put fops are
> +	 * called.
> +	 * The flow of the different actions is:
> +	 * [alloc]:	Starts with alloc_begin. The handlers logic is than
> +	 *		executed. If the handler is successful, alloc_commit
> +	 *		is called and the object is inserted to the repository.
> +	 *		Otherwise, alloc_abort is called and the object is
> +	 *		destroyed.

Add:

	Once alloc_commit completes, the object is visible to other
	threads and userspace.

> +	 * [lookup]:	Starts with lookup_get which fetches and locks the
> +	 *		object. After the handler finished using the object, it
> +	 *		needs to call lookup_put to unlock it. The write flag
> +	 *		indicates if the object is locked for exclusive access.
> +	 * [destroy]:	Starts with lookup_get with write flag set. This locks
> +	 *		the object for exclusive access. If the handler code
> +	 *		completed successfully, destroy_commit is called and
> +	 *		the ib_uobject is removed from the context's uobjects
> +	 *		repository and put. Otherwise, lookup_put should be
> +	 *		called.

Let's call this 'remove' please. And add:

	Once remove succeeds, new krefs to the object cannot be acquired
	by other threads or userspace and the hardware driver is removed
	from the object. Other krefs on the object may still exist.

> +	 * [hot_unplug]:	Used when the context is destroyed (process
> +	 *			termination, reset flow).

I don't think we need a dedicated entry point. I think you should add an
enum argument to remove:

	enum rdma_remove_reason {
		RDMA_REMOVE_DESTROY,	// Userspace requested uobject deletion
		RDMA_REMOVE_CLOSE,	// Context deletion. Call cannot fail.
		RDMA_REMOVE_DRIVER_REMOVE, // Driver is being hot-unplugged. Call cannot fail.
	};

This gives the drivers more information to do the right thing.

> +	 * [release]:	Releases the memory of the ib_uobject. This is called
> +	 *		when the last reference is put. We favored a callback
> +	 *		here as this might require tricks like kfree_rcu.
> +	 *		This shouldn't be called explicitly by the user as it's
> +	 *		used by uverbs_uobject_put.

Delete as discussed.

> +	void (*destroy_commit)(struct ib_uobject *uobj);

Use:

	/* Must be called with the write lock held. If successful uobj is
	 * invalid on return. On failure uobject is left completely
	 * unchanged */
	int __must_check (*remove)(struct ib_uobject *uobj,
				   enum rdma_remove_reason why);

> +struct uverbs_obj_idr_type {
> +	/*
> +	 * In idr based objects, uverbs_obj_type_ops points to a generic
> +	 * idr operations. In order to specialize the underlying types (e.g. CQ,
> +	 * QPs, etc.), we add obj_size and hot_unplug specific callbacks here.
> +	 */
> +	struct uverbs_obj_type type;
> +	size_t obj_size;

Move to type; do not support non-kalloc configurations.

> +	/* The hot_unplug in ops initialized by idr, calls this callback */
> +	void (*hot_unplug)(struct ib_uobject *uobj);

Use:

	/* Free driver resources from the uobject, make the driver
	 * uncallable, and move the uobject to the detached state. On
	 * failure the uobject is left completely unchanged. */
	int __must_check (*destroy)(struct ib_uobject *uobj,
				    enum rdma_remove_reason why);

And probably add some commentary on what objects support a detached
state. Any object that cannot be detached can only exist in the IDR and
cannot have a kref taken. Perhaps we should enforce that directly for
clarity.

Jason
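P.S. Putting those last pieces together, I'd expect the idr flavour of
remove to end up looking roughly like this (just a sketch; the function
name is illustrative and uobjects_lock is the renamed mutex from above):

	static int __must_check
	remove_commit_idr_uobject(struct ib_uobject *uobj,
				  enum rdma_remove_reason why)
	{
		const struct uverbs_obj_idr_type *idr_type =
			container_of(uobj->type, struct uverbs_obj_idr_type,
				     type);
		int ret = idr_type->destroy(uobj, why);

		/* Only a userspace-requested destroy is allowed to fail;
		 * RDMA_REMOVE_CLOSE and RDMA_REMOVE_DRIVER_REMOVE must tear
		 * the object down regardless. */
		if (ret && why == RDMA_REMOVE_DESTROY)
			return ret;

		uverbs_idr_remove_uobj(uobj);
		mutex_lock(&uobj->context->uobjects_lock);
		list_del(&uobj->list);
		mutex_unlock(&uobj->context->uobjects_lock);
		uverbs_uobject_put(uobj);

		return ret;
	}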