Hi Doug,

Thanks for the feedback.

>> The ability to have RSS via hashed receives and multiple flows is a start,
>> now you just need to add pinning the flows to specific sockets that are
>> adjacent to the PCI bus the controller is on and populating the number
>> of threads to use for RSS based upon the number of cores/threads per
>> socket.

I think the proposed API and the overall solution allow that.

>> I'm not sure I like the specific
>> implementation. It puts a lot of new, complex configuration on the
>> application.

I think it is important that the verbs API, which exists to provide direct access to the HW, explicitly expose RSS settings and allow verbs applications to adjust the RSS configuration to their needs, so they can get the full performance benefit of the HW capabilities. The ability to control the RSS hash function and the indirection table properties is important for several fast packet processing use cases:

- Bump-in-the-wire applications, for example, may prefer symmetric RSS, so an interface that allows configuring the hash function is valuable.
- An application may know which traffic types it will handle and can adjust which packet fields participate in the RSS hash to better distribute the traffic.
- Exposing an API that sets the RSS hash properties and the RSS indirection table allows better verbs integration with frameworks such as DPDK.

Note that we do think a more "high level" RSS API, one that requires fewer changes at the application level, should be supported by rdma_cm.

>> In other words, I would like to see an application
>> get the benefit of this without having to be recoded. Maybe they only
>> get a default benefit and not an optimized benefit, and maybe the user
>> has to set an environment variable for libibverbs before it gets enabled
>> by default, but I would still like to see a non-recompiled app benefit
>> from this.
I am not sure I understand how the "no recoding" approach can work: what "default" benefit can an application that works in polling mode get? In polling mode, the affinities of the application's threads determine which CPU cores are used for completion and packet processing. For example, if an application decides to open a single thread that polls the CQ and processes completions, I don't see how the processing could be better distributed across CPU cores without changing the application.

For interrupt-driven applications we can get some partial "default" benefit: we can load-balance the interrupt handler and the "bottom half" part of packet processing, but not the processing that is done in user space. Even this requires API changes, or perhaps a new environment parameter as you mentioned, since the application supplies a comp_vector at CQ creation, and running the completion event handler on different cores (as a result of RSS) actually breaks this API.

To summarize: in my opinion, verbs is the right place to expose an RSS interface that gives a verbs application full control over the different RSS properties, while a simpler, more "high level" RSS API can be exposed via rdma_cm. This answers, on the one hand, the needs of solutions that require the ability to directly access and configure the HW, such as user-space NIC solutions, and, on the other hand, the needs of applications that want a more "high level" API.
Thanks,
AlexV

-----Original Message-----
From: Doug Ledford [mailto:dledford@xxxxxxxxxx]
Sent: Saturday, May 30, 2015 12:57 AM
To: RDMA mailing list
Cc: Weiny, Ira; Matan Barak; Or Gerlitz; Haggai Eran; Yishai Hadas; Alex Vainman
Subject: Current patch statuses

* Ira's 3 patch const cleanup: generally approved, waiting on the latest version
* Ira's 14 patch OPA set: I haven't reviewed it yet, waiting for the next version
* Matan's 14 patch RoCE GID set: Changes were requested, awaiting the next version
* Or's 11 patch timestamp set: The general idea is OK, creating an extension is OK, but I think the actual uAPI needs to be nailed down a little more. Right now it's too vendor/model/driver specific for a general uAPI.
* Haggai's 12 patch cma namespace set: Serious concerns over the suitability of creating more than one link per unique guid/pkey. Requesting that Haggai investigate using alias GUIDs for containers instead.
* Or's 3 patch SRIOV set: both changes and coordination with the netdev/iproute2 folks requested, awaiting the results of that interaction and an update for 8-byte GUID management
* Yishai's 3 patch hot removal set: changes in locking requested
* Alex's Verbs RSS RFC: I read through the proposal, but I didn't research it in enough detail to provide high quality feedback. But I will give a little low quality feedback.

First the good: I'm generally receptive to anything that improves our NUMA operation. While I didn't see that the RSS work necessarily took NUMA into consideration, the overall framework looked like a good starting point for doing exactly that. The ability to have RSS via hashed receives and multiple flows is a start; now you just need to add pinning the flows to specific sockets that are adjacent to the PCI bus the controller is on, and populating the number of threads to use for RSS based upon the number of cores/threads per socket.

Next the questionable: I'm not sure I like the specific implementation. It puts a lot of new, complex configuration on the application.
But the benefits of this could have a default setting without requiring application interaction. I'm not sure I wouldn't prefer something that, by default, creates all new QPs and CQs so that they go through a default hashing and redirection without any user interaction at all. Or maybe the user simply sets a flag to signal that they are OK with multithreaded completion handling and they get it. Or something like that. In other words, I would like to see an application get the benefit of this without having to be recoded. Maybe they only get a default benefit and not an optimized benefit, and maybe the user has to set an environment variable for libibverbs before it gets enabled by default, but I would still like to see a non-recompiled app benefit from this. The proposal I see here might be able to do that if the changes to libibverbs are done properly. And if the implementation is right, it might be able to help the kernel consumers too.

So, please keep these things in mind as you continue to work on this. And with that, I've got to do some Red Hat work for a bit. I won't be as responsive the early part of next week as a result.

--
Doug Ledford <dledford@xxxxxxxxxx>
GPG KeyID: 0E572FDD