Re: [PATCH rdma-next 4/7] IB/core: Refactor GID modify code for RoCE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 28, 2018 at 03:43:41PM +0300, Leon Romanovsky wrote:
> From: Parav Pandit <parav@xxxxxxxxxxxx>
> 
> Code is refactored to prepare separate functions for RoCE which can do more
> complex operations related to reference counting, while still
> maintainining code readability. This includes
> (a) Simplification to not perform netdevice checks and modifications
> for IB link layer.
> (b) Do not add RoCE GID entry which has NULL netdevice; instead return
> an error.
> (c) If GID addition fails at provider level add_gid(), do not add the
> entry in the cache and keep the entry marked as INVALID.
> (d) Simplify and reuse the ib_cache_gid_add()/del() routines so that they
> can be used even for modifying default GIDs. This avoid some code
> duplication in modifying default GIDs.
> (e) find_gid() routine refers to the data entry flags to qualify a GID
> as valid or invalid GID rather than depending on attributes and zeroness
> of the GID content.
> (f) gid_table_reserve_default() sets the GID default attribute at
> beginning while setting up the GID table. There is no need to use
> default_gid flag in low level functions such as write_gid(), add_gid(),
> del_gid(), as they never need to update the DEFAULT property of the GID
> entry while during GID table update.

This is quite a lot to have going on in a single patch, but it all
seems inter-depending so OK..

> +static void del_roce_gid(struct ib_device *device, u8 port_num,
> +			 struct ib_gid_table *table, int ix)
>  	__releases(&table->rwlock) __acquires(&table->rwlock)
>  {
> +	pr_debug("%s device=%s port=%d index=%d gid %pI6\n", __func__,
> +		 device->name, port_num, ix,
> +		 table->data_vec[ix].gid.raw);
> +
> +	write_unlock_irq(&table->rwlock);
> +	if (rdma_cap_roce_gid_table(device, port_num))
> +		device->del_gid(device, port_num, ix,
> +				&table->data_vec[ix].context);
> +	dev_put(table->data_vec[ix].attr.ndev);
> +	write_lock_irq(&table->rwlock);
> +}

This doesn't look right.. The old code only did the release inside
rdma_cap_roce_gid_table() which maybe is how it avoided racing with
ib_cache_update (which doesn't run if !rdma_cap_roce_gid_table?
Unclear to me)

Overall this locking scheme is bad. But! After all the changes it is
pretty easy to fix it, I will send you something for your opinion.

> +static int add_roce_gid(struct ib_gid_table *table,
> +			const union ib_gid *gid,
> +			const struct ib_gid_attr *attr)
> +	__releases(&table->rwlock) __acquires(&table->rwlock)
> +{
[..]
> +	write_unlock_irq(&table->rwlock);
> +	if (rdma_cap_roce_gid_table(attr->device, attr->port_num)) {
> +		ret = attr->device->add_gid(attr->device, attr->port_num,
> +					    ix, gid, attr, &entry->context);
> +		if (ret) {
> +			pr_err("%s GID add failed device=%s port=%d index=%d\n",
> +			       __func__, attr->device->name, attr->port_num,
> +			       attr->index);
> +			goto add_err;
> +		}
>  	}

Ditto.

> +static int add_modify_gid(struct ib_gid_table *table,
> +			  const union ib_gid *gid,
> +			  const struct ib_gid_attr *attr)
> +{
> +	int ret = 0;
> +
> +	if (rdma_protocol_roce(attr->device, attr->port_num)) {
> +		ret = add_roce_gid(table, gid, attr);
> +	} else {
> +		/*
> +		 * Some HCA's report multiple GID entries with only one
> +		 * valid GID, but remaining as zero GID.
> +		 * So ignore such behavior for IB link layer and don't
> +		 * fail the call, but don't add such entry to GID cache.
> +		 */
> +		if (!memcmp(gid, &zgid, sizeof(*gid)))
> +			return 0;
> +	}
> +
> +	if (!ret) {
> +		memcpy(&table->data_vec[attr->index].gid, gid, sizeof(*gid));
> +		memcpy(&table->data_vec[attr->index].attr, attr, sizeof(*attr));
> +		table->data_vec[attr->index].props &= ~GID_TABLE_ENTRY_INVALID;
> +	}
> +	return ret;
>  }

The control flow is convoluted here, just return immediatelty if
add_roce_gid fails and drop the 'if (!ret)' and make 'return ret' into
'return 0'

> +		pr_warn("%s: unable to add gid %pI6 error=%d\n",
> +			__func__, gid->raw, ret);

Do we really need all these prints? Can userspace or worse, remote via
MADs, trigger console spamming with these?

Otherwise I think this patch is a welcomed improvement.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux