Re: [Ilw] Intel Wireless 7260 hardware timed out randomly

On Sat, Nov 09, 2013 at 10:46:21AM +0800, wzyboy wrote:
> 2013/11/9 Bjorn Helgaas <bhelgaas@xxxxxxxxxx>:
> > Thanks.  But can you please attach the output of "lspci -vvxxx" (not
> > "-vxxxx") for the entire system before the problem occurs?
> 
> 
> Sorry I used the wrong command...
> 
> I've attached the output of -vvxxx below.
> 
> There are three files:
> 
> * lspci.vvxxx.normal.txt: When the interface is "state DOWN" in "ip link".
> * lspci.vvxxx.normal2.txt: When the interface is "state UP" in "ip
> link" after I ran "ip link set wlan0 up".
> * lspci.vvxxx.normal3.txt: When the interface is connected to the
> Wi-Fi of my dormitory and got an address (but without default
> gateway, I'm using wired network now).

The only interesting difference is this (between "normal" and "normal3"):

--- lspci.vvxxx.normal.txt      2013-11-11 14:42:14.000000000 -0700
+++ lspci.vvxxx.normal3.txt     2013-11-11 14:42:14.000000000 -0700

 00:1c.1 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 3 (rev e4) (prog-if 00 [Normal decode])
-               LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
+               LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train+ SlotClk+ DLActive+ BWMgmt+ ABWMgmt-

In "normal3", the Link Training bit is set.  I'm not a hardware person,
but my guess is that this might be normal.  The spec says Link Training
indicates that the "LTSSM is in the Configuration or Recovery state,"
and Figure 5-1 shows that the transition from L1 to L0 goes through
the Recovery state.  So we might just be seeing the device returning
from L1 to L0.  Maybe Emmanuel can confirm this with the hardware guys.
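
For anyone who wants to check the raw register rather than the lspci
decoding, here is a rough Python sketch of how the Link Status word
breaks down.  The function name and the two sample values are
illustrative assumptions: the values were worked out by hand from the
LnkSta lines above, not read from the hardware.

```python
def decode_lnksta(value):
    """Decode the PCIe Link Status register fields that lspci -vv prints."""
    return {
        "speed_code": value & 0xF,            # bits 3:0, 1 means 2.5GT/s
        "width": (value >> 4) & 0x3F,         # bits 9:4, negotiated lane count
        "train": bool(value & (1 << 11)),     # Link Training
        "slot_clk": bool(value & (1 << 12)),  # Slot Clock Configuration
        "dl_active": bool(value & (1 << 13)), # Data Link Layer Link Active
        "bw_mgmt": bool(value & (1 << 14)),   # Link Bandwidth Management Status
        "abw_mgmt": bool(value & (1 << 15)),  # Link Autonomous Bandwidth Status
    }

# Hand-derived sample values (assumptions, see above):
# 0x7011 matches the "normal" LnkSta line, 0x7811 matches "normal3".
normal = decode_lnksta(0x7011)
normal3 = decode_lnksta(0x7811)
print(normal["train"], normal3["train"])
```

The only field that differs between the two sample values is "train",
which matches the single changed flag in the diff.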

Comparing "lspci.vvxxx.normal.txt" with "lspci.vvxxx.patched.bug.txt",
I see these changes in the 00:1c.1 Downstream Port (the bridge that
leads to the 7260 NIC):

--- before      2013-11-11 15:24:04.755738964 -0700
+++ after       2013-11-11 15:24:11.875722068 -0700
 00:1c.1 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 3 (rev e4) (prog-if 00 [Normal decode])
-               DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
+               DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
-               LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
+               LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt-
-               SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
-                       Changed: MRL- PresDet- LinkState+
+               SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
+                       Changed: MRL- PresDet+ LinkState+
-               DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
+               DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-

So when the bug occurs,

  - Correctable Error Detected is set
  - Data Link Layer Link Active is cleared
  - Presence Detect State is cleared
  - LTR Mechanism Enable is cleared (spec says this bit must be
    reset to the default value when a Downstream Port goes to
    DL_Down)
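
If it helps to repeat this comparison on other dumps, the flag changes
above can be pulled out mechanically.  This is only a sketch: the
helper names are made up for illustration, and it assumes the usual
"Name+" / "Name-" flag style of lspci -vv output.

```python
import re

def parse_flags(line):
    """Map each 'Name+' or 'Name-' token in an lspci -vv line to True/False."""
    pairs = re.findall(r"(\w+)([+-])(?=\s|,|$)", line)
    return {name: sign == "+" for name, sign in pairs}

def changed_flags(before, after):
    """Return {flag: (old, new)} for flags whose state differs between two lines."""
    b, a = parse_flags(before), parse_flags(after)
    return {k: (b[k], a[k]) for k in b if k in a and b[k] != a[k]}

before = "LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-"
after = "LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt-"
print(changed_flags(before, after))
```

The lookahead in the pattern keeps hyphenated words such as "Point-LP"
in the device name from being mistaken for flags.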

This all seems consistent with the device being powered off.  Maybe
the 7260 is on a daughterboard with a bad connection to the system
board?  Any chance you can open up the box and make sure the
connection is tight?

It's possible there's some ASPM issue, but I would think Presence
Detect would still work even if the 7260 had a problem with ASPM.
Here's another experiment to try to rule out ASPM.  Run these
commands as root after the driver is loaded but before the bug occurs:

  setpci -s03:00.0 0x50.W=0x140
  setpci -s00:1c.1 0x50.W=0x040
  lspci -vv

These write the Link Control register on both ends of the link with the
ASPM Control bits cleared.  That should disable ASPM completely on that
link, and the lspci output will help verify that.
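
For reference, here is a rough Python sketch of what the two values
written above mean, assuming offset 0x50 is the Link Control register
on both devices (that is what the commands rely on).  The function
name is illustrative, and only the fields relevant here are decoded.

```python
ASPM_STATES = {0: "disabled", 1: "L0s", 2: "L1", 3: "L0s and L1"}

def decode_lnkctl(value):
    """Decode the ASPM and clock-related fields of a PCIe Link Control value."""
    return {
        "aspm": ASPM_STATES[value & 0x3],        # bits 1:0, ASPM Control
        "common_clock": bool(value & (1 << 6)),  # Common Clock Configuration
        "clock_pm": bool(value & (1 << 8)),      # Enable Clock Power Management
    }

print(decode_lnkctl(0x140))  # value written to 03:00.0
print(decode_lnkctl(0x040))  # value written to 00:1c.1
```

Both values leave the ASPM Control field at zero, so the LnkCtl lines
in the "lspci -vv" output for both devices should then report ASPM as
disabled.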

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



