Re: [RFC PATCH] verbs: Introduce mlx5: Implement uncontended independent communication paths

On Thu, May 3, 2018 at 6:19 PM, Rohit Zambre <rzambre@xxxxxxx> wrote:
> An independent communication path is one that shares no hardware resources
> with other communication paths. From a Verbs perspective, an independent
> path is the one obtained by the first QP in a context. The next QPs of the
> context may or may not share hardware resources amongst themselves; the
> mapping of the resources to the QPs is provider-specific. Sharing resources
> can hurt throughput in certain cases. When only one thread uses the
> independent path, we term it an uncontended independent path.
>
> Today, the user has no way to request an independent path for an
> arbitrary QP within a context. To create multiple independent paths, the
> Verbs user must create multiple contexts with 1 QP per context. However,
> this translates to significant hardware-resource wastage: 89% in the case
> of the ConnectX-4 mlx5 device.
>
> This RFC patch allows the user to request for uncontended independent
> communication paths in Verbs through an "independent" flag during Thread
> Domain (TD) creation. The patch also provides a first-draft implementation
> of uncontended independent paths in the mlx5 provider.
>
> In mlx5, every even-odd pair of TDs shares the same UAR page, which is not the
> case when the user creates multiple contexts with one TD per context. When
> the user requests an independent TD, the driver will dynamically
> allocate a new UAR page and map bfreg_0 of that UAR to the TD. bfreg_1 of
> the UAR belonging to an independent TD is never used and is essentially
> wasted. Hence, there must be a maximum number of independent paths allowed
> within a context since the hardware resources are limited. This would be
> half of the maximum number of dynamic UARs allowed per context.
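
As a reading aid, the TD-to-UAR mapping described above can be modeled
roughly as follows. This is only a toy sketch under the assumptions stated
in the quoted text (two bfregs per UAR page, independent TDs taking bfreg_0
of a freshly allocated page). It is not mlx5 provider code, and every name
in it (td_slot, map_regular_td, map_independent_td, same_uar_page,
max_independent_paths) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in rdma-core or the mlx5 provider. */

/* Where a TD's doorbell lands: which UAR page and which bfreg on that page. */
struct td_slot {
    bool dedicated_uar; /* true if the UAR page was allocated just for this TD */
    int  uar;           /* UAR page index within its pool */
    int  bfreg;         /* 0 or 1 */
};

/* Regular TD: TDs 2k and 2k+1 share UAR page k, one bfreg each. */
static struct td_slot map_regular_td(int td_index)
{
    assert(td_index >= 0);
    struct td_slot s = { false, td_index / 2, td_index % 2 };
    return s;
}

/*
 * Independent TD: the n-th one gets its own UAR page and uses bfreg_0.
 * bfreg_1 of that page is never handed out, so it is effectively wasted.
 */
static struct td_slot map_independent_td(int n)
{
    assert(n >= 0);
    struct td_slot s = { true, n, 0 };
    return s;
}

/* Two TDs share a UAR page only if they come from the same pool and index. */
static bool same_uar_page(struct td_slot a, struct td_slot b)
{
    return a.dedicated_uar == b.dedicated_uar && a.uar == b.uar;
}

/* Cap as stated in the proposal: half of the dynamic UARs allowed per context. */
static int max_independent_paths(int max_dynamic_uars)
{
    assert(max_dynamic_uars >= 0);
    return max_dynamic_uars / 2;
}
```

In this model, regular TDs 0 and 1 land on the same UAR page with different
bfregs, while no two independent TDs ever share a page.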

I'm not sure I follow what you're trying to achieve here at the mlx5 HW level.
Are you assuming that two threads with separate 'indep-comm-paths',
each using a separate bfreg on the same UAR page, cause some contention and
a performance hit in the mlx5 HW?
We should first prove that's true, and then design a solution for it.
Do you have benchmark results of any kind?

When you create two separate ibv_contexts you separate a lot more
than just the UAR pages on which the bfregs are mapped. The entire
software locking scheme is separated.

The ibv_td object allows the user to separate resources so that locks
could be managed in a smarter way in the provider lib data fast path.
For that we allocate a bfreg for each ibv_td obj. Using a dedicated
bfreg allows lower latency sends, as the doorbell does not need a lock
to write the even/odd entries.
At the time we did not extend the work to cover additional locks in
mlx5, but it seems your series is targeting something else.

Alex