Hi Leon, thanks for taking a look. Responses below.

> On Jul 24, 2019, at 1:47 AM, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Tue, Jul 23, 2019 at 03:13:37PM -0400, Chuck Lever wrote:
>> Send and Receive completion is handled on a single CPU selected at
>> the time each Completion Queue is allocated. Typically this is when
>> an initiator instantiates an RDMA transport, or when a target
>> accepts an RDMA connection.
>>
>> Some ULPs cannot open a connection per CPU to spread completion
>> workload across available CPUs. For these ULPs, allow the RDMA core
>> to select a completion vector based on the device's complement of
>> available comp_vecs.
>>
>> When a ULP elects to use RDMA_CORE_ANY_COMPVEC, if multiple CPUs are
>> available, a different CPU will be selected for each Completion
>> Queue. For the moment, a simple round-robin mechanism is used.
>>
>> Suggested-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>
> It make me wonder why do we need comp_vector as an argument to ib_alloc_cq?
> From what I see, or callers are internally implementing similar logic
> to proposed here, or they don't care (set 0).

The goal of this patch is to deduplicate that "similar logic". Callers
that already implement this logic can use RDMA_CORE_ANY_COMPVEC and
get rid of their own copy.

> Can we enable this comp_vector for everyone and simplify our API?

We could create a new CQ allocation API that does not take a comp
vector. That might be cleaner than passing in a -1. But I think some
ULPs still want to use the existing API to allocate one CQ for each of
a device's comp vectors.
>> ---
>>  drivers/infiniband/core/cq.c             |   20 +++++++++++++++++++-
>>  include/rdma/ib_verbs.h                  |    3 +++
>>  net/sunrpc/xprtrdma/svc_rdma_transport.c |    6 ++++--
>>  net/sunrpc/xprtrdma/verbs.c              |    5 ++---
>>  4 files changed, 28 insertions(+), 6 deletions(-)
>>
>> Jason-
>>
>> If this patch is acceptable to all, then I would expect you to take
>> it through the RDMA tree.
>>
>>
>> diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
>> index 7c599878ccf7..a89d549490c4 100644
>> --- a/drivers/infiniband/core/cq.c
>> +++ b/drivers/infiniband/core/cq.c
>> @@ -165,12 +165,27 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
>>  	queue_work(cq->comp_wq, &cq->work);
>>  }
>>
>> +/*
>> + * Attempt to spread ULP completion queues over a device's completion
>> + * vectors so that all available CPU cores can help service the device's
>> + * interrupt workload. This mechanism may be improved at a later point
>> + * to dynamically take into account the system's actual workload.
>> + */
>> +static int ib_get_comp_vector(struct ib_device *dev)
>> +{
>> +	static atomic_t cv;
>> +
>> +	if (dev->num_comp_vectors > 1)
>> +		return atomic_inc_return(&cv) % dev->num_comp_vectors;
>
> It is worth to take into account num_online_cpus(),

I don't believe it is. Håkon has convinced me that assigning interrupt
vectors to CPUs is in the domain of user space (i.e., driven by policy).
In addition, one assumes that taking a CPU offline properly will also
involve re-assigning the interrupt vectors that point to that core.

In any event, this code can be modified after it is merged if it becomes
necessary to accommodate such requirements.

--
Chuck Lever