On Sat, Feb 18, 2023 at 09:23:47PM +0800, Yang Su wrote:
> I do not understand why pci_bridge_wait_for_secondary_bus() can fix
> Intel's Ponte Vecchio HPC GPU after a DPC-induced Hot Reset.
>
> The func pci_bridge_wait_for_secondary_bus() also use
> pcie_wait_for_link_delay() which time depends on the max device delay
> time of one bus, for the GPU which bus only one device, I think the
> time is 100ms as the input parameter in pcie_wait_for_link_delay().
>
> pcie_wait_for_link() also wait fixed 100ms and then wait the device data
> link is ready. So another wait time is pci_dev_wait() in your patch?
> pci_dev_wait() to receive the CRS from the device to check the device
> whether is ready.
>
> Please help me understand which difference work.

The crucial difference is the invocation of pci_dev_wait(), which
waits up to 60 seconds for the device to come out of reset.

The spec allows 1 second for this, but the device may extend that
by responding with CRS.  Ponte Vecchio has been witnessed to take
more than 4 seconds in some cases, hence the need to wait longer
than 1 second.

Thanks,

Lukas
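
To illustrate why the longer timeout matters, here is a minimal userspace
sketch of a poll-until-ready loop with a doubling delay.  It is only
loosely modeled on the idea behind pci_dev_wait() and is not the kernel's
code.  The names sim_dev, sim_read_config(), sim_sleep() and sim_dev_wait()
are hypothetical, the clock is simulated, and the 4500 ms readiness time
used in the note below is an assumed stand-in for "more than 4 seconds".

```c
#include <stdint.h>

/* Hypothetical simulated device, for illustration only. */
struct sim_dev {
	uint32_t now_ms;      /* simulated clock */
	uint32_t ready_at_ms; /* time at which the device becomes ready */
};

/*
 * Stand-in for a config space read: returns all-ones while the
 * device is still not ready, an arbitrary valid value afterwards.
 */
static uint32_t sim_read_config(const struct sim_dev *dev)
{
	return dev->now_ms >= dev->ready_at_ms ? 0x00000006u : 0xFFFFFFFFu;
}

/* Advance the simulated clock instead of actually sleeping. */
static void sim_sleep(struct sim_dev *dev, uint32_t ms)
{
	dev->now_ms += ms;
}

/*
 * Poll until the device answers, doubling the delay each round.
 * Returns 0 once the device is ready, -1 if the next delay would
 * exceed timeout_ms.
 */
static int sim_dev_wait(struct sim_dev *dev, uint32_t timeout_ms)
{
	uint32_t delay = 1;

	while (sim_read_config(dev) == 0xFFFFFFFFu) {
		if (delay > timeout_ms)
			return -1;
		sim_sleep(dev, delay);
		delay *= 2;
	}
	return 0;
}
```

With a device that needs 4500 ms, sim_dev_wait(&dev, 1000) gives up and
returns -1, whereas sim_dev_wait(&dev, 60000) keeps polling and returns 0.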