On Fri, Dec 15, 2017 at 01:39:50PM -0500, Jingoo Han wrote: > On Thursday, December 14, 2017 5:58 PM, Bjorn Helgaas wrote: > > In the PCI config access path, the *_pcie_valid_device() functions in > > the dwc, altera, rockchip, and xilinx drivers all check whether the > > link is up. > > > > I think this is racy because the link may go down after we check but > > before we perform the config access. > > > > What would blow up if we removed the *_pcie_link_up() checks? > > The original intention is to avoid config access before link up. > > Also, I did not find any racy condition as you mentioned. > However, if you think that we need to prevent the racy condition, > someone can send a patch or add comments. Here's the race: pci_read_config_word ... dw_pcie_rd_conf dw_pcie_valid_device dw_pcie_link_up # returns true; link currently up <-- link goes down for unrelated reason dw_pcie_rd_other_conf dw_pcie_read readl # issue config read The readl() that issues the config read in the PCIe domain races with the link-down event. If the readl() completes first, everything works as expected. If the link-down happens first, I don't know what happens. This is the core of my question. The hardware *should* do something sane because link-down is an asynchronous event that is unrelated to the config read and may happen at any time. It's just part of the PCIe environment and the spec defines mechanisms for dealing with and reporting the situation. > > I'd like to either remove the checks or add comments about why the > > race is acceptable. If we've covered this before, I apologize. > > Adding a comment will keep me from pestering you about this again in > > the future. > > > > Bjorn >