On Thu, May 3, 2018 at 6:19 PM, Rohit Zambre <rzambre@xxxxxxx> wrote:
> An independent communication path is one that shares no hardware resources
> with other communication paths. From a Verbs perspective, an independent
> path is the one obtained by the first QP in a context. The next QPs of the
> context may or may not share hardware resources amongst themselves; the
> mapping of the resources to the QPs is provider-specific. Sharing resources
> can hurt throughput in certain cases. When only one thread uses the
> independent path, we term it an uncontended independent path.
>
> Today, the user has no way to request for an independent path for an
> arbitrary QP within a context. To create multiple independent paths, the
> Verbs user must create multiple contexts with 1 QP per context. However,
> this translates to significant hardware-resource wastage: 89% in the case
> of the ConnectX-4 mlx5 device.
>
> This RFC patch allows the user to request for uncontended independent
> communication paths in Verbs through an "independent" flag during Thread
> Domain (TD) creation. The patch also provides a first-draft implementation
> of uncontended independent paths in the mlx5 provider.
>
> In mlx5, every even-odd pair of TDs share the same UAR page, which is not
> the case when the user creates multiple contexts with one TD per context. When
> the user requests for an independent TD, the driver will dynamically
> allocate a new UAR page and map bfreg_0 of that UAR to the TD. bfreg_1 of
> the UAR belonging to an independent TD is never used and is essentially
> wasted. Hence, there must be a maximum number of independent paths allowed
> within a context since the hardware resources are limited. This would be
> half of the maximum number of dynamic UARs allowed per context.

I'm not sure I follow what you're trying to achieve here on the mlx5 HW level.
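
To make sure I read the description above correctly, here is a minimal
sketch of the TD-to-bfreg mapping as I understand it from the quoted text.
This is a toy model, not mlx5 provider code: the struct and all function
names are hypothetical, and the only facts it encodes are that a UAR page
exposes bfreg_0 and bfreg_1, that even-odd pairs of TDs share a page by
default, and that an "independent" TD would get a fresh page and use
bfreg_0 only.

```c
/* Toy model of the mapping described in the quoted text.
 * All names here are hypothetical; none of this is the mlx5 provider API. */

#define BFREGS_PER_UAR 2   /* bfreg_0 and bfreg_1 on each UAR page */

struct td_slot {
    int uar;    /* index of the UAR page backing this TD */
    int bfreg;  /* bfreg used within that page (0 or 1) */
};

/* Default case: every even-odd pair of TDs shares one UAR page. */
static struct td_slot shared_td_slot(int td_index)
{
    struct td_slot s;
    s.uar = td_index / BFREGS_PER_UAR;
    s.bfreg = td_index % BFREGS_PER_UAR;
    return s;
}

/* Proposed "independent" TD: its own UAR page, bfreg_0 only.
 * The page index simply equals the TD index in this model. */
static struct td_slot independent_td_slot(int td_index)
{
    struct td_slot s;
    s.uar = td_index;
    s.bfreg = 0;
    return s;
}

/* UAR pages consumed by n TDs under each scheme. */
static int uars_for_shared(int n)
{
    return (n + BFREGS_PER_UAR - 1) / BFREGS_PER_UAR;
}

static int uars_for_independent(int n)
{
    return n;
}

/* bfregs left unused when all n TDs are independent (bfreg_1 of each page). */
static int wasted_bfregs_independent(int n)
{
    return n * (BFREGS_PER_UAR - 1);
}
```

In this model, TDs 0 and 1 land on the same page with different bfregs,
while two independent TDs never share a page, at the cost of one unused
bfreg per TD.
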
Are you assuming that two threads with separate 'indep-comm-paths' using
separate bfregs on the same UAR page cause some contention and a performance
hit in the mlx5 HW? We should first prove that's true, and then design a
solution for it. Do you have benchmark results of any kind?

When you create two separate ibv_contexts you separate a lot more than
just the UAR pages on which the bfregs are mapped. The entire software
locking scheme is separated.

The ibv_td object allows the user to separate resources so that locks can
be managed in a smarter way in the provider lib data fast path. For that we
allocate a bfreg for each ibv_td object. Using a dedicated bfreg allows
lower-latency sends, as the doorbell does not need a lock to write the
even/odd entries.

At the time we did not extend the work to cover additional locks in mlx5,
but it seems your series is targeting something else.

Alex