I thought it would be merged into the 5.12 release. A little disappointed :<
What can I do to help?

Thanks,
Etan

-----Original Message-----
From: Kuppuswamy, Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Sent: Wednesday, April 28, 2021 8:40 AM
To: Lukas Wunner <lukas@xxxxxxxxx>; Bjorn Helgaas <helgaas@xxxxxxxxxx>; Williams, Dan J <dan.j.williams@xxxxxxxxx>
Cc: Zhao, Haifeng <haifeng.zhao@xxxxxxxxx>; Sinan Kaya <okaya@xxxxxxxxxx>; Raj, Ashok <ashok.raj@xxxxxxxxx>; Keith Busch <kbusch@xxxxxxxxxx>; linux-pci@xxxxxxxxxxxxxxx; Russell Currey <ruscur@xxxxxxxxxx>; Oliver O'Halloran <oohall@xxxxxxxxx>; Stuart Hayes <stuart.w.hayes@xxxxxxxxx>; Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
Subject: Re: [PATCH] PCI: pciehp: Ignore Link Down/Up caused by DPC

Hi Bjorn,

On 3/30/21 1:53 PM, Kuppuswamy, Sathyanarayanan wrote:
>> Downstream Port Containment (PCIe Base Spec, sec. 6.2.10) disables
>> the link upon an error and attempts to re-enable it when instructed
>> by the DPC driver.
>>
>> A slot which is both DPC- and hotplug-capable is currently brought
>> down by pciehp once DPC is triggered (due to the link change) and
>> brought up on successful recovery. That's undesirable, the slot
>> should remain up so that the hotplugged device remains bound to its
>> driver. DPC notifies the driver of the error and of successful
>> recovery in pcie_do_recovery() and the driver may then restore the
>> device to working state.
>>
>> Moreover, Sinan points out that turning off slot power by pciehp may
>> foil recovery by DPC: Power off/on is a cold reset concurrently to
>> DPC's warm reset. Sathyanarayanan reports extended delays or failure
>> in link retraining by DPC if pciehp brings down the slot.
>>
>> Fix by detecting whether a Link Down event is caused by DPC and
>> awaiting recovery if so. On successful recovery, ignore both the
>> Link Down and the subsequent Link Up event.
>>
>> Afterwards, check whether the link is down to detect surprise-removal
>> or another DPC event immediately after DPC recovery. Ensure that the
>> corresponding DLLSC event is not ignored by synthesizing it and
>> invoking irq_wake_thread() to trigger a re-run of pciehp_ist().
>>
>> The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and
>> dpc_handler(), race against each other. If pciehp is faster than
>> DPC, it will wait until DPC recovery completes.
>>
>> Recovery consists of two steps: The first step (waiting for link
>> disablement) is recognizable by pciehp through a set DPC Trigger
>> Status bit. The second step (waiting for link retraining) is
>> recognizable through a newly introduced PCI_DPC_RECOVERING flag.
>>
>> If DPC is faster than pciehp, neither of the two flags will be set
>> and pciehp may glean the recovery status from the new
>> PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all,
>> hence DLLSC events are not ignored by default.
>>
>> This commit draws inspiration from previous attempts to synchronize
>> DPC with pciehp:
>>
>> By Sinan Kaya, August 2018:
>> https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/
>>
>> By Ethan Zhao, October 2020:
>> https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/
>>
>> By Sathyanarayanan Kuppuswamy, March 2021:
>> https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx/
>>
> Looks good to me. This patch fixes the reported issue in our environment.
>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>

Any update on this patch? Is it queued for merge? One of our customers
is waiting for this fix, so I am wondering about the status.

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer