On Wed, Sep 23, 2020 at 07:50:14PM +0300, Leon Romanovsky wrote:
> From: Avihai Horon <avihaih@xxxxxxxxxx>
>
> Introduce rdma_query_gid_table which enables querying all the GID tables
> of a given device and copying the attributes of all valid GID entries to
> a provided buffer.
>
> This API provides a faster way to query a GID table using single call and
> will be used in libibverbs to improve current approach that requires
> multiple calls to open, close and read multiple sysfs files for a single
> GID table entry.
>
> Signed-off-by: Avihai Horon <avihaih@xxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
>  drivers/infiniband/core/cache.c         | 73 ++++++++++++++++++++++++-
>  include/rdma/ib_cache.h                 |  3 +
>  include/uapi/rdma/ib_user_ioctl_verbs.h |  8 +++
>  3 files changed, 81 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index cf49ac0b0aa6..211b88d17bc7 100644
> +++ b/drivers/infiniband/core/cache.c
> @@ -1247,6 +1247,74 @@ rdma_get_gid_attr(struct ib_device *device, u8 port_num, int index)
>  }
>  EXPORT_SYMBOL(rdma_get_gid_attr);
>
> +/**
> + * rdma_query_gid_table - Reads GID table entries of all the ports of a device up to max_entries.
> + * @device: The device to query.
> + * @entries: Entries where GID entries are returned.
> + * @max_entries: Maximum number of entries that can be returned.
> + * Entries array must be allocated to hold max_entries number of entries.
> + * @num_entries: Updated to the number of entries that were successfully read.
> + *
> + * Returns number of entries on success or appropriate error code.
> + */
> +ssize_t rdma_query_gid_table(struct ib_device *device,
> +			     struct ib_uverbs_gid_entry *entries,
> +			     size_t max_entries)
> +{
> +	const struct ib_gid_attr *gid_attr;
> +	ssize_t num_entries = 0, ret;
> +	struct ib_gid_table *table;
> +	unsigned int port_num, i;
> +	struct net_device *ndev;
> +	unsigned long flags;
> +
> +	rdma_for_each_port(device, port_num) {
> +		if (!rdma_ib_or_roce(device, port_num))
> +			continue;
> +
> +		table = rdma_gid_table(device, port_num);
> +		read_lock_irqsave(&table->rwlock, flags);
> +		for (i = 0; i < table->sz; i++) {
> +			if (!is_gid_entry_valid(table->data_vec[i]))
> +				continue;
> +			if (num_entries >= max_entries) {
> +				ret = -EINVAL;
> +				goto err;
> +			}
> +
> +			gid_attr = &table->data_vec[i]->attr;
> +
> +			memcpy(&entries->gid, &gid_attr->gid,
> +			       sizeof(gid_attr->gid));
> +			entries->gid_index = gid_attr->index;
> +			entries->port_num = gid_attr->port_num;
> +			entries->gid_type = gid_attr->gid_type;
> +			rcu_read_lock();
> +			ndev = rdma_read_gid_attr_ndev_rcu(gid_attr);

This can't call rdma_read_gid_attr_ndev_rcu(), that also obtains the
rwlock. rwlock can't be nested.

Why didn't lockdep explode on this?

This whole thing can just be:

    ndev = rcu_dereference_protected(gid_attr->ndev, lockdep_is_held(&table->rwlock))
    if (ndev)
        entries->netdev_ifindex = ndev->ifindex;

Jason