On Thu, Jul 26, 2018 at 10:07:35AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
>
> The upstream kernel commit cited below modified the workqueue in the
> new CQ API to be bound to a specific CPU (instead of being unbound).
> This caused ALL users of the new CQ API to use the same bound WQ.
>
> Specifically, MAD handling was severely delayed when the CPU bound
> to the WQ was busy handling (higher priority) interrupts.
>
> This caused a delay in the MAD "heartbeat" response handling,
> which resulted in ports being incorrectly classified as "down".
>
> To fix this, add a new "unbound" WQ type to the new CQ API, so that users
> have the option to choose either a bound WQ or an unbound WQ.
>
> For MADs, choose the new "unbound" WQ.
>
> Fixes: b7363e67b23e ("IB/device: Convert ib-comp-wq to be CPU-bound")
> Signed-off-by: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/core/cq.c     | 23 +++++++++++++++++++++++
>  drivers/infiniband/core/device.c | 15 ++++++++++++++-
>  drivers/infiniband/core/mad.c    |  2 +-
>  include/rdma/ib_verbs.h          |  8 +++++---
>  4 files changed, 43 insertions(+), 5 deletions(-)

This seems pretty straightforward to me.

But isn't it a bit strange overall that CQ processing works globally
single-threaded? If MAD cares, I bet everything else does too.

Is there some reason why we need to have a bound work queue at all?

Sagi, can you comment?

Jason