[PATCH for-next 10/14] IB/Hfi1: Read CCE Revision register to verify the device is responsive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Kamenee Arumugam <kamenee.arumugam@xxxxxxxxx>

When Hfi1 device is unresponsive, reading the RcvArrayCnt register
will return all 1's. This value is then used to remap chip's RcvArray.
The incorrect all ones value used in remapping RcvArray
will cause warn on as shown by trace below:

[<ffffffff81685eac>] dump_stack+0x19/0x1b
[<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
[<ffffffff810858bc>] warn_slowpath_fmt+0x5c/0x80
[<ffffffff81065c29>] __ioremap_caller+0x279/0x320
[<ffffffff8142873c>] ? _dev_info+0x6c/0x90
[<ffffffffa021d155>] ? hfi1_pcie_ddinit+0x1d5/0x330 [hfi1]
[<ffffffff81065d62>] ioremap_wc+0x32/0x40
[<ffffffffa021d155>] hfi1_pcie_ddinit+0x1d5/0x330 [hfi1]
[<ffffffffa0204851>] hfi1_init_dd+0x1d1/0x2440 [hfi1]
[<ffffffff813503dc>] ? pci_write_config_word+0x1c/0x20

Read CCE revision register first to verify that WFR device is
responsive. If the read return "all ones", bail out from init
and fail the driver load.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@xxxxxxxxx>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
---
 drivers/infiniband/hw/hfi1/chip.c |    7 -------
 drivers/infiniband/hw/hfi1/pcie.c |    8 ++++++++
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 3929656..4b97581 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -15042,13 +15042,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 	if (ret < 0)
 		goto bail_cleanup;
 
-	/* verify that reads actually work, save revision for reset check */
-	dd->revision = read_csr(dd, CCE_REVISION);
-	if (dd->revision == ~(u64)0) {
-		dd_dev_err(dd, "cannot read chip CSRs\n");
-		ret = -EINVAL;
-		goto bail_cleanup;
-	}
 	dd->majrev = (dd->revision >> CCE_REVISION_CHIP_REV_MAJOR_SHIFT)
 			& CCE_REVISION_CHIP_REV_MAJOR_MASK;
 	dd->minrev = (dd->revision >> CCE_REVISION_CHIP_REV_MINOR_SHIFT)
diff --git a/drivers/infiniband/hw/hfi1/pcie.c b/drivers/infiniband/hw/hfi1/pcie.c
index c1c9829..87bd6b6 100644
--- a/drivers/infiniband/hw/hfi1/pcie.c
+++ b/drivers/infiniband/hw/hfi1/pcie.c
@@ -183,6 +183,14 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
 		return -ENOMEM;
 	}
 	dd_dev_info(dd, "UC base1: %p for %x\n", dd->kregbase1, RCV_ARRAY);
+
+	/* verify that reads actually work, save revision for reset check */
+	dd->revision = readq(dd->kregbase1 + CCE_REVISION);
+	if (dd->revision == ~(u64)0) {
+		dd_dev_err(dd, "Cannot read chip CSRs\n");
+		goto nomem;
+	}
+
 	dd->chip_rcv_array_count = readq(dd->kregbase1 + RCV_ARRAY_CNT);
 	dd_dev_info(dd, "RcvArray count: %u\n", dd->chip_rcv_array_count);
 	dd->base2_start  = RCV_ARRAY + dd->chip_rcv_array_count * 8;

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux