Hello, I tested all the 4 patches and they works great on my system (one that sets the completion on all writes to "slot control" register). Tested-by: Rajat Jain <rajatxjain@xxxxxxxxx> 02:01.0 PCI bridge: Integrated Device Technology, Inc. Device 807a (rev 02) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0 Bus: primary=02, secondary=90, subordinate=9f, sec-latency=0 I/O behind bridge: 00002000-00002fff Memory behind bridge: 88000000-8bffffff Prefetchable memory behind bridge: 00000000b0200000-00000000b03fffff Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR- BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: [40] Express (v2) Downstream Port (Slot+), MSI 00 DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us ExtTag+ RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 128 bytes DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- LnkCap: Port #1, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0 <4us, L1 <4us ClockPM- Surprise+ LLActRep+ BwNot+ LnkCtl: ASPM Disabled; Disabled- Retrain- CommClk- ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt- SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise- Slot #1, PowerLimit 0.000W; Interlock- NoCompl- SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet+ CmdCplt+ HPIrq+ LinkChg+ Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock- SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock- Changed: MRL- PresDet- LinkState- DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported ARIFwd+ DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd- LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [c0] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+ Address: 00000000fff41740 Data: 0001 Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Capabilities: [200 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=4 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=04 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff Status: NegoPending- InProgress- Port Arbitration Table <?> Capabilities: [320 v1] Access Control Services ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl+ DirectTrans+ ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans- Capabilities: [330 v1] #12 Kernel driver in use: pcieport Before these 4 patches, there was no timeout on my system but there was a spurious error message saying [ 26.324838] pciehp 0000:02:00.0:pcie24: Unexpected CMD_COMPLETED. Need to wait for command completed event which was gone with these patches. Thanks! Rajat On Sat, Jun 14, 2014 at 2:21 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: > We use incorrect logic to decide whether a PCIe hotplug controller > generates command completion events. > > 5808639bfa98 ("pciehp: fix slow probing") assumed that the Slot Status > "Command Completed" bit was set only for commands affecting slot power, > indicators, or electromechanical interlock. That assumption is false: per > sec. 6.7.3.2 of PCIe spec r3.0, a write targeting any portion of the Slot > Control register is a command, and (if command completed events are > supported) software must wait for a command to complete before issuing the > next command. > > 5808639bfa98 was to fix boot-time timeouts (see bugzilla below) on a Lenovo > Thinkpad R61 with an Intel hotplug controller. The controller probably has > the Intel CF118 erratum, which means it doesn't report Command Completed > unless the Slot Control power, indicator, or interlock bits are changed. > This causes a timeout because pciehp always waits for Command Complete (if > supported), regardless of which bits are changed. > > Remove the incorrect logic because the timeouts have been addressed > differently by these changes: > > PCI: pciehp: Wait for hotplug command completion lazily > PCI: pciehp: Compute timeout from hotplug command start time > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=10751 > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > --- > drivers/pci/hotplug/pciehp_hpc.c | 39 ++++++++------------------------------ > 1 file changed, 8 insertions(+), 31 deletions(-) > > diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c > index 18ac24d84f9b..1f70de5359fb 100644 > --- a/drivers/pci/hotplug/pciehp_hpc.c > +++ b/drivers/pci/hotplug/pciehp_hpc.c > @@ -161,8 +161,10 @@ static void pcie_wait_cmd(struct controller *ctrl) > rc = wait_event_timeout(ctrl->queue, !ctrl->cmd_busy, timeout); > else > rc = pcie_poll_cmd(ctrl, timeout); > + > if (!rc) > - ctrl_dbg(ctrl, "Command not completed in 1000 msec\n"); > + ctrl_info(ctrl, "Timeout on hotplug command %#010x\n", > + ctrl->slot_ctrl); > } > > /** > @@ -174,7 +176,6 @@ static void pcie_wait_cmd(struct controller *ctrl) > static void pcie_write_cmd(struct controller *ctrl, u16 cmd, u16 mask) > { > struct pci_dev *pdev = ctrl_dev(ctrl); > - u16 slot_status; > u16 slot_ctrl; > > mutex_lock(&ctrl->ctrl_lock); > @@ -182,30 +183,6 @@ static void pcie_write_cmd(struct controller *ctrl, u16 cmd, u16 mask) > /* Wait for any previous command that might still be in progress */ > pcie_wait_cmd(ctrl); > > - pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status); > - if (slot_status & PCI_EXP_SLTSTA_CC) { > - pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, > - PCI_EXP_SLTSTA_CC); > - if (!ctrl->no_cmd_complete) { > - /* > - * After 1 sec and CMD_COMPLETED still not set, just > - * proceed forward to issue the next command according > - * to spec. Just print out the error message. > - */ > - ctrl_dbg(ctrl, "CMD_COMPLETED not clear after 1 sec\n"); > - } else if (!NO_CMD_CMPL(ctrl)) { > - /* > - * This controller seems to notify of command completed > - * event even though it supports none of power > - * controller, attention led, power led and EMI. > - */ > - ctrl_dbg(ctrl, "Unexpected CMD_COMPLETED. Need to wait for command completed event\n"); > - ctrl->no_cmd_complete = 0; > - } else { > - ctrl_dbg(ctrl, "Unexpected CMD_COMPLETED. Maybe the controller is broken\n"); > - } > - } > - > pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &slot_ctrl); > slot_ctrl &= ~mask; > slot_ctrl |= (cmd & mask); > @@ -785,14 +762,14 @@ struct controller *pcie_init(struct pcie_device *dev) > mutex_init(&ctrl->ctrl_lock); > init_waitqueue_head(&ctrl->queue); > dbg_ctrl(ctrl); > + > /* > * Controller doesn't notify of command completion if the "No > - * Command Completed Support" bit is set in Slot Capability > - * register or the controller supports none of power > - * controller, attention led, power led and EMI. > + * Command Completed Support" bit is set in Slot Capabilities. > + * If set, it means the controller can accept hotplug commands > + * with no delay between them. > */ > - if (NO_CMD_CMPL(ctrl) || > - !(POWER_CTRL(ctrl) | ATTN_LED(ctrl) | PWR_LED(ctrl) | EMI(ctrl))) > + if (NO_CMD_CMPL(ctrl)) > ctrl->no_cmd_complete = 1; > > /* Check if Data Link Layer Link Active Reporting is implemented */ > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html