RE: [for-rc v3] RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>-----Original Message-----
>From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-
>owner@xxxxxxxxxxxxxxx] On Behalf Of Devesh Sharma
>Sent: Friday, May 25, 2018 12:01 PM
>To: dledford@xxxxxxxxxx; jgg@xxxxxxxxxxxx
>Cc: linux-rdma@xxxxxxxxxxxxxxx; jtoppins@xxxxxxxxxx; ddutile@xxxxxxxxxx;
>Devesh Sharma <devesh.sharma@xxxxxxxxxxxx>
>Subject: [for-rc v3] RDMA/bnxt_re: Fix broken RoCE driver due to recent L2
>driver changes
>
>The recent changes in Broadcom's ethernet driver(L2 driver) broke
>RoCE functionality in terms of MSIx vector allocation and
>de-allocation.
>
>There is a possibility that L2 driver would initiate MSIx vector
>reallocation depending upon the requests coming from administrator.
>In such cases L2 driver needs to free up all the MSIx vectors
>allocated previously and reallocate/initialize those.
>
>If RoCE driver is loaded and reshuffling is attempted, there will be
>kernel crashes because RoCE driver would still be holding the MSIx
>vectors but L2 driver would attempt to free in-use vectors. Thus
>leading to a kernel crash.
>
>Making changes in roce driver to fix crashes described above.
>As part of solution L2 driver tells RoCE driver to release
>the MSIx vector whenever there is a need. When RoCE driver
>get message it sync up with all the running tasklets and IRQ
>handlers and releases the vectors. L2 driver send one more
>message to RoCE driver to resume the MSIx vectors. L2 driver
>guarantees that RoCE vector do not change during reshuffling.
>
>Fixes: ec86f14ea506 ("bnxt_en: Add ULP calls to stop and restart IRQs.")
>Fixes: 08654eb213a8 ("bnxt_en: Change IRQ assignment for RDMA driver.")
>Signed-off-by: Devesh Sharma <devesh.sharma@xxxxxxxxxxxx>
>---
> drivers/infiniband/hw/bnxt_re/main.c       | 55 ++++++++++++++++-
> drivers/infiniband/hw/bnxt_re/qplib_fp.c   | 94 +++++++++++++++++++-----
>------
> drivers/infiniband/hw/bnxt_re/qplib_fp.h   |  3 +
> drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 61 +++++++++++++------
> drivers/infiniband/hw/bnxt_re/qplib_rcfw.h |  3 +
> 5 files changed, 163 insertions(+), 53 deletions(-)
>
>--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
>+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
>@@ -336,22 +336,32 @@ static irqreturn_t bnxt_qplib_nq_irq(int irq, void
>*dev_instance)
> 	return IRQ_HANDLED;
> }
>
>+void bnxt_qplib_nq_stop_irq(struct bnxt_qplib_nq *nq, bool kill)
>+{
>+	tasklet_disable(&nq->worker);
>+	/* Mask h/w interrupt */
>+	NQ_DB(nq->bar_reg_iomem, nq->hwq.cons, nq-
>>hwq.max_elements);
>+	/* Sync with last running IRQ handler */
>+	synchronize_irq(nq->vector);
>+	if (kill)
>+		tasklet_kill(&nq->worker);
>+	if (nq->requested) {
>+		irq_set_affinity_hint(nq->vector, NULL);
>+		free_irq(nq->vector, nq);
>+		nq->requested = false;
>+	}
>+}
>+
> void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
> {
> 	if (nq->cqn_wq) {
> 		destroy_workqueue(nq->cqn_wq);
> 		nq->cqn_wq = NULL;
> 	}
>+
> 	/* Make sure the HW is stopped! */
>-	synchronize_irq(nq->vector);
>-	tasklet_disable(&nq->worker);
>-	tasklet_kill(&nq->worker);
>+	bnxt_qplib_nq_stop_irq(nq, true);
>
>-	if (nq->requested) {
>-		irq_set_affinity_hint(nq->vector, NULL);
>-		free_irq(nq->vector, nq);
>-		nq->requested = false;
>-	}
> 	if (nq->bar_reg_iomem)
> 		iounmap(nq->bar_reg_iomem);
> 	nq->bar_reg_iomem = NULL;
>@@ -361,6 +371,40 @@ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq
>*nq)
> 	nq->vector = 0;
> }
>
>+int bnxt_qplib_nq_start_irq(struct bnxt_qplib_nq *nq, int nq_indx,
>+			    int msix_vector, bool need_init)
>+{
>+	int rc;
>+
>+	if (nq->requested)
>+		return -EFAULT;
>+
>+	nq->vector = msix_vector;
>+	if (need_init)
>+		tasklet_init(&nq->worker, bnxt_qplib_service_nq,
>+			     (unsigned long)nq);
>+	else
>+		tasklet_enable(&nq->worker);
>+
>+	snprintf(nq->name, sizeof(nq->name), "bnxt_qplib_nq-%d",
>nq_indx);
>+	rc = request_irq(nq->vector, bnxt_qplib_nq_irq, 0, nq->name, nq);

Just a suggestion:

Have you look into the pci_request_irq()/pci_free_irq() interface?

This interface removes the snprintf() (it is incorporated in the interface so you do
not need to keep the name), and cleans up the usage of the vector and PCI IRQ
information.

M
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux