On Mon, Jan 28, 2019 at 10:23:33PM -0700, Jason Gunthorpe wrote: > On Tue, Jan 22, 2019 at 08:15:48PM +0200, Leon Romanovsky wrote: > > > + /* > > + * Once all drivers are converted, we can remove this check > > + * and remove call to rdma_restrack_add() from rdma_restrack_kadd() > > + * and rdma_restrack_uadd() > > + */ > > + if (xa_load(xa, res->id)) > > + return 0; > > So we check if the ID is present.. Because the driver did something? > But no drivers do in this series, I guess? Don't get it. This patch annotated driver with rdma_restrcak_add() https://patchwork.kernel.org/patch/10782985/ It makes some paths with double calls to rdma_restrcak_add(), which this cxa_load() prevents. > > > kref_init(&res->kref); > > init_completion(&res->comp); > > res->valid = true; > > > > - ret = xa_insert(xa, res_to_id(res), res, GFP_KERNEL); > > - WARN_ONCE(ret == -EEXIST, "Tried to add non-unique type %d entry\n", > > - res->type); > > + if (rt[res->type].range.max) { > > + /* Not HW-capable device */ > > + ret = rt_xa_alloc_cyclic(xa, &res->id, res, > > + &rt[res->type].range, > > + &rt[res->type].next_id, GFP_KERNEL); > > + } else { > > + ret = xa_insert(xa, res->id, res, GFP_KERNEL); > > Then do insert I'm not fully understand the comment above, what did you mean? > > > + } > > + > > + /* > > + * WARNs below indicates an error in driver code. > > + * The check of -EEXIST is never occuried and added here > > + * to allow simple removal of xa_load above. > > + */ > > + WARN_ONCE(ret == -EEXIST, "Tried to add non-unique %s entry %u\n", > > + type2str(res->type), res->id); > > Then complain (which can't happen because we checked?) I want to remove xa_load() above and as I wrote in comment, those WARNS are not possible for now. > > > + WARN_ONCE(ret == -ENOSPC, > > + "There are no more free indexes for type %s entry %u\n", > > + type2str(res->type), res->id); > > + > > and this one can only happen for alloc_cyclic > > really confusing > > > if (ret) > > res->valid = false; > > + > > + return ret; > > } > > +EXPORT_SYMBOL(rdma_restrack_add); > > > > /** > > * rdma_restrack_kadd() - add kernel object to the reource tracking database > > @@ -300,10 +356,20 @@ static void rdma_restrack_add(struct rdma_restrack_entry *res) > > */ > > void rdma_restrack_kadd(struct rdma_restrack_entry *res) > > { > > + struct ib_device *dev = res_to_dev(res); > > + struct xarray *xa = rdma_dev_to_xa(dev, res->type); > > + > > res->task = NULL; > > set_kern_name(res); > > res->user = false; > > - rdma_restrack_add(res); > > + /* > > + * Temporaly, we are not intested in return value, > > + * once conversion will be finished, it will be checked by drivers. > > + */ > > + if (rdma_restrack_add(res)) > > + return; > > + > > + xa_set_mark(xa, res->id, RES_VISIBLE); > > And I have no idea what RES_VISIBLE is supposed to do. It is set in > two places, both places are the only callers of rdma_restrack_add > .. so every entry in the xarray has RES_VISIBLE set, except for a very > small time window after rdma_restrack_add returning? Why? It is supposed to prevent the situations where we called rdma_restrack_add on driver objects, but didn't configure all needed info yet, like task and kern_name. > > Also looks like rdma_restrack_add() should be made static since there > are no more callers.. The callers are coming https://patchwork.kernel.org/patch/10782985/ > > > @@ -347,7 +423,8 @@ rdma_restrack_get_byid(struct ib_device *dev, > > struct rdma_restrack_entry *res; > > > > res = xa_load(xa, id); > > - if (!res || xa_is_err(res) || !rdma_restrack_get(res)) > > xa_load doesn't return an xa_err as far as I can see.. Yes, it is mistake in rebase. I introduced it in "RDMA/restrack: Translate from ID to restrack object" and removed in wrong patch. > > > @@ -393,8 +469,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > > return; > > > > xa = rdma_dev_to_xa(dev, res->type); > > - id = res_to_id(res); > > - if (!xa_load(xa, id)) > > + if (!xa_load(xa, res->id)) > > goto out; > > Seems strange, isn't this sort of test what valid is supposed to be > used for? Why do we have valid, !NULL in the xarray and RES_VISIBLE > all at once? I need a way to distinguish between restrack entries without ib_device set and CM_IDs, see the comment above "if (!dev)" check. > > > rdma_restrack_put(res); > > @@ -402,7 +477,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > > wait_for_completion(&res->comp); > > > > down_write(&dev->res[res->type].rwsem); > > - xa_erase(xa, id); > > + xa_erase(xa, res->id); > > res->valid = false; > > up_write(&dev->res[res->type].rwsem); > > > > @@ -430,5 +505,6 @@ void rdma_rt_set_id_range(struct ib_device *dev, enum rdma_restrack_type type, > > > > rt[type].range.max = max; > > rt[type].range.min = reserved; > > + rt[type].next_id = reserved; > > } > > EXPORT_SYMBOL(rdma_rt_set_id_range); > > diff --git a/include/rdma/restrack.h b/include/rdma/restrack.h > > index 0706f51bb1f8..9d329098ddfc 100644 > > +++ b/include/rdma/restrack.h > > @@ -93,6 +93,11 @@ struct rdma_restrack_entry { > > * @user: user resource > > */ > > bool user; > > + /* > > + * @id: unique to specific type identifier, for HW-capable devices, > > + * drivers are supposed to update it, because it is used as an index. > > + */ > > + u32 id; > > }; > > > > int rdma_restrack_init(struct ib_device *dev); > > @@ -104,6 +109,14 @@ int rdma_restrack_count(struct ib_device *dev, > > void rdma_restrack_kadd(struct rdma_restrack_entry *res); > > void rdma_restrack_uadd(struct rdma_restrack_entry *res); > > > > +/** > > + * Addition of entry is performed in two steps approach: > > + * 1. Driver creates ID and allocates resource entry. > > + * 2. IB/core marks such entry as user/kernel and exports to nldev.c > > + */ > > This sounds like it is re-coding xa_reserve? Why can't the driver use > xa_reserve to hold its ID and we can just make the xarray value > non-NULL when it the entry is ready to go? Seems simpler if the driver > needs to rely on the retrack allocator for IDs.. Not really, I see restrack as a perfect id-to-object translator and not only allocator for IDs. It will allow to remove endless number of home-grown radix trees in drivers. This id-to-object translator requires that object != NULL for relevant ID. Thanks > > Jason
Attachment:
signature.asc
Description: PGP signature