Hi all,

My card is a Gen2 x8 device plugged into a DELL R710 (Intel 5520 IOH platform with Xeon X5560). A PLX 8624 switch is populated on the card, and 4 identical endpoint devices are connected to the switch. The hierarchy is shown below. The OS is SUSE 10 (SP3), kernel version 2.6.16.60-0.54.5-smp.

    -[0000:00]-+-00.0
               +-07.0-[0000:06-0c]----00.0-[0000:07-0c]--+-04.0-[0000:08]--
               |                                         +-05.0-[0000:09]----00.0
               |                                         +-06.0-[0000:0a]----00.0
               |                                         +-08.0-[0000:0b]----00.0
               |                                         \-09.0-[0000:0c]----00.0

I’m investigating a software recovery mechanism based on Hot Reset. When a fatal error is detected and reported from the card, I use Hot Reset to recover the card. To test the recovery flow, I also use Hot Reset to break normal operation. Here is the sequence:

1. Turn off AER reporting in the Root Complex by clearing the “Root Error Command Register (0x2C)”.
2. Mask all Uncorrectable Errors in the Root Port’s AER.
3. Turn off conventional PCI error reporting by clearing “SERR# Enable” in both the “Command Register (0x04)” and the “Bridge Control Register (0x3E)”.
4. Issue a Hot Reset by writing the “Secondary Bus Reset” bit in the Root Port’s “Bridge Control Register”.
5. The card driver detects the transaction problem.
6. The driver clears “Bus Master Enable” and polls the “Transactions Pending” bit in both the Root Port and the Switch’s upstream port to wait for existing transactions to complete.
7. The driver issues a Hot Reset by writing the “Secondary Bus Reset” bit in the Root Port.
8. The driver performs post-initialization after link up.

This iteration can run for several rounds, and then a link down occurs between the Root Port and the Switch’s upstream port.

I tried modifying the flow: before step 4, I added code to clear “Bus Master Enable” and poll the “Transactions Pending” bit. But the link down still occurs.

I see that the kernel supports a “graceful” Hot Reset flow through some system functions. But it could be a big effort for the card driver to cooperate with that framework, so I took the shortcut.

My question is: will such a direct Hot Reset impact overall system functionality?
Or is there any chance that the IOH disables link training after exiting from Hot Reset? Can the IOH detect such a Hot Reset even if I have masked all uncorrectable errors in the Root Port’s AER?

Thank you very much!

Best regards,
Xin Meng
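
For reference, the reset in steps 4 and 7 amounts to roughly the following. This is only a minimal user-space sketch in Python, not the actual driver code. It assumes the Root Port's configuration space is reachable as a seekable binary file (for example the sysfs `config` file of 0000:00:07.0, opened as root), and the helper names `read16`, `write16` and `hot_reset` are hypothetical. The 2 ms hold time and 100 ms settle time are assumed values, not ones measured on this platform.

```python
import struct
import time

PCI_BRIDGE_CONTROL = 0x3E        # 16-bit Bridge Control register (Type 1 header)
PCI_BRIDGE_CTL_BUS_RESET = 0x40  # Secondary Bus Reset, bit 6


def read16(cfg, offset):
    """Read a little-endian 16-bit register from a config-space file object."""
    cfg.seek(offset)
    return struct.unpack("<H", cfg.read(2))[0]


def write16(cfg, offset, value):
    """Write a little-endian 16-bit register to a config-space file object."""
    cfg.seek(offset)
    cfg.write(struct.pack("<H", value & 0xFFFF))
    cfg.flush()


def hot_reset(cfg, hold_s=0.002, settle_s=0.1, sleep=time.sleep):
    """Pulse Secondary Bus Reset on a bridge (hypothetical helper).

    hold_s and settle_s are assumed delays: how long the bit stays set,
    and how long to wait after clearing it before touching the devices
    below the bridge.
    """
    ctl = read16(cfg, PCI_BRIDGE_CONTROL)
    # Assert reset, leaving the other Bridge Control bits unchanged.
    write16(cfg, PCI_BRIDGE_CONTROL, ctl | PCI_BRIDGE_CTL_BUS_RESET)
    sleep(hold_s)
    # Deassert reset and give the link time to retrain.
    write16(cfg, PCI_BRIDGE_CONTROL, ctl & ~PCI_BRIDGE_CTL_BUS_RESET)
    sleep(settle_s)
    return read16(cfg, PCI_BRIDGE_CONTROL)
```

Against real hardware, this would be called with something like `open("/sys/bus/pci/devices/0000:00:07.0/config", "r+b", buffering=0)` as `cfg`. The read-modify-write keeps the other Bridge Control bits (such as the “SERR# Enable” bit cleared in step 3) as they were.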