Ben Hutchings <ben@xxxxxxxxxxxxxxx> writes: > 3.2.49-rc1 review patch. If anyone has any objections, please let me know. > > ------------------ > > From: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx> > > commit 6d446ec32f169c6a5d9bc90684a8082a6cbe90f6 upstream. > > When the EEH error is the result of a fenced host bridge, MMIO accesses > can be very slow (milliseconds) to timeout and return all 1's, > thus causing the driver various timeout loops to take way too long and > trigger soft-lockup warnings (in addition to taking minutes to recover). > > It might be worthwhile to check if for any of these cases, ffffffff is > a valid possible value, and if not, bail early since that means the HW > is either gone or isolated. In the meantime, checking that the PCI channel > is offline would be workaround of the problem. > > Signed-off-by: Gavin Shan <shangw@xxxxxxxxxxxxxxxxxx> > Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> > [bwh: Backported to 3.2: adjust context, indentation] > Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx> > --- > drivers/net/ethernet/broadcom/tg3.c | 36 ++++++++++++++++++++++++++++++++++++ > 1 file changed, 36 insertions(+) > > --- a/drivers/net/ethernet/broadcom/tg3.c > +++ b/drivers/net/ethernet/broadcom/tg3.c > @@ -688,6 +688,9 @@ static int tg3_ape_lock(struct tg3 *tp, > status = tg3_ape_read32(tp, gnt + off); > if (status == bit) > break; > + if (pci_channel_offline(tp->pdev)) > + break; > + > udelay(10); > } > > @@ -1465,6 +1468,9 @@ static void tg3_wait_for_event_ack(struc > for (i = 0; i < delay_cnt; i++) { > if (!(tr32(GRC_RX_CPU_EVENT) & GRC_RX_CPU_DRIVER_EVENT)) > break; > + if (pci_channel_offline(tp->pdev)) > + break; > + > udelay(8); > } > } > @@ -1628,6 +1634,9 @@ static int tg3_poll_fw(struct tg3 *tp) > for (i = 0; i < 200; i++) { > if (tr32(VCPU_STATUS) & VCPU_STATUS_INIT_DONE) > return 0; > + if (pci_channel_offline(tp->pdev)) > + return -ENODEV; > + > udelay(100); > } > return -ENODEV; > @@ -1638,6 +1647,15 @@ static int tg3_poll_fw(struct tg3 *tp) > tg3_read_mem(tp, NIC_SRAM_FIRMWARE_MBOX, &val); > if (val == ~NIC_SRAM_FIRMWARE_MBOX_MAGIC1) > break; > + if (pci_channel_offline(tp->pdev)) { > + if (!tg3_flag(tp, NO_FWARE_REPORTED)) { > + tg3_flag_set(tp, NO_FWARE_REPORTED); > + netdev_info(tp->dev, "No firmware running\n"); > + } > + > + break; > + } > + > udelay(10); > } > > @@ -3067,6 +3085,10 @@ static int tg3_halt_cpu(struct tg3 *tp, > tw32(offset + CPU_MODE, CPU_MODE_HALT); > if (tr32(offset + CPU_MODE) & CPU_MODE_HALT) > break; > + if (pci_channel_offline(tp->pdev)) > + return -EBUSY; > + if (pci_channel_offline(tp->pdev)) > + return -EBUSY; > } I believe you didn't want to have these two invocations to the pci_channel_offline() function. i guess you wanted to have one of these moved to the other branch of the 'if' statement. [ btw, I've just replied to an email by David S. Miller about his backport to 3.4 (and 3.2) of this commit. ] Cheers, -- Luis > > tw32(offset + CPU_STATE, 0xffffffff); @@ -7569,6 > +7591,14 @@ static int tg3_stop_block(struct tg3 *tp tw32_f(ofs, > val); > > for (i = 0; i < MAX_WAIT_CNT; i++) { > + if (pci_channel_offline(tp->pdev)) { > + dev_err(&tp->pdev->dev, > + "tg3_stop_block device offline, " > + "ofs=%lx enable_bit=%x\n", > + ofs, enable_bit); > + return -ENODEV; > + } > + > udelay(100); > val = tr32(ofs); > if ((val & enable_bit) == 0) > @@ -7592,6 +7622,13 @@ static int tg3_abort_hw(struct tg3 *tp, > > tg3_disable_ints(tp); > > + if (pci_channel_offline(tp->pdev)) { > + tp->rx_mode &= ~(RX_MODE_ENABLE | TX_MODE_ENABLE); > + tp->mac_mode &= ~MAC_MODE_TDE_ENABLE; > + err = -ENODEV; > + goto err_no_dev; > + } > + > tp->rx_mode &= ~RX_MODE_ENABLE; > tw32_f(MAC_RX_MODE, tp->rx_mode); > udelay(10); > @@ -7640,6 +7677,7 @@ static int tg3_abort_hw(struct tg3 *tp, > err |= tg3_stop_block(tp, BUFMGR_MODE, BUFMGR_MODE_ENABLE, silent); > err |= tg3_stop_block(tp, MEMARB_MODE, MEMARB_MODE_ENABLE, silent); > > +err_no_dev: > for (i = 0; i < tp->irq_cnt; i++) { > struct tg3_napi *tnapi = &tp->napi[i]; > if (tnapi->hw_status) > > -- > To unsubscribe from this list: send the line "unsubscribe stable" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html