On Mon, Apr 15, 2019 at 03:45:16PM +0200, Thierry Reding wrote: > On Mon, Apr 15, 2019 at 05:06:10PM +0530, Manikanta Maddireddy wrote: > > > > On 12-Apr-19 8:20 PM, Bjorn Helgaas wrote: > > > [+cc Jingoo, Gustavo (dwc maintainers), Ley (altera), Michal (xilinx)] > > > > > > On Fri, Apr 12, 2019 at 12:30:22PM +0530, Manikanta Maddireddy wrote: > > >> On 12-Apr-19 1:45 AM, Bjorn Helgaas wrote: > > >>> On Thu, Apr 11, 2019 at 10:33:47PM +0530, Manikanta Maddireddy wrote: > > >>>> Add PCIe link up check in config read and write callback functions > > >>>> before accessing endpoint config registers. > > >>>> static int tegra_pcie_config_read(struct pci_bus *bus, unsigned int devfn, > > >>>> int where, int size, u32 *value) > > >>>> { > > >>>> + struct tegra_pcie *pcie = bus->sysdata; > > >>>> + struct pci_dev *bridge; > > >>>> + struct tegra_pcie_port *port; > > >>>> + > > >>>> if (bus->number == 0) > > >>>> return pci_generic_config_read32(bus, devfn, where, size, > > >>>> value); > > >>>> > > >>>> + bridge = pcie_find_root_port(bus->self); > > >>>> + > > >>>> + list_for_each_entry(port, &pcie->ports, list) > > >>>> + if (port->index + 1 == PCI_SLOT(bridge->devfn)) > > >>>> + break; > > >>>> + > > >>>> + /* If there is no link, then there is no device */ > > >>>> + if (!tegra_pcie_link_status(port)) { > > >>> This is racy and you should avoid it if possible. The link could go down > > >>> between calling tegra_pcie_link_status() and issuing the config read/write. > > >>> > > >>> If your driver is to be reliable, it must be able to handle any bad > > >>> consequence of issuing that config read/write anyway, so I think it's > > >>> better if it doesn't even bother checking whether the link is up. > > >> This change is made based on similar check present in dwc driver > > >> dw_pcie_valid_device(), reasons for making this change in Tegra might > > >> differ dwc. > > > Yes, you won't be surprised to learn that I don't like the similar > > > checks in dwc, altera, xilinx, and xilinx-nwl either :) I raise this > > > issue every time I see it, but I can't remember if I've mentioned dwc > > > specifically. > > > > > > We need to either eradicate this pattern of checking for link up, or > > > include a comment about why it is absolutely necessary. > > > > This patch is created to address below scenario in our downstream kernel, > > 1) Our platform has WiFi on one slot and GPU in another. > > 2) During WiFi OFF, link is put in L2 and it goes through hot reset > > when turning ON WiFi (since Tegra doesn't support hot-plug). > > 3) Whenever x11 server is started it scans the PCIe bus for video devices. > > Here PCIe configuration registers of all devices are read to find out > > all available video devices. > > 4) If "x11 server" started with WiFi OFF, then we are seeing "response > > decoding error"(Tegra AFI module specific error). > > > > Best solution we came up with is to have link up check in config access > > callback functions. > > So we really need this to prevent a userspace access to PCI config space > from triggering these errors? I'm not familiar with how PCI access from > userspace works, but if modifying the accessors fixes this problem it > sounds like userspace would end up calling these accessors. If so, it > sounds more like we should fix this at the point where userspace calls > these accessors. According to what you're saying this should never be an > issue from kernel space, because as long as a driver needs access to its > device, the PCI bus should be up. > > And if that wasn't the case, then we probably do want to see these AER > errors to help diagnose the issue. > > So could we instead have some sort of host bridge operation that would > expose the link status and use that as part of the userspace access to > PCI configuration space? Looks like maybe pci_user_read_config_*() would be a good place to check for this? They're defined by the PCI_USER_READ_CONFIG() macro in drivers/pci/access.c. Same for pci_user_write_config_*(). Thierry
Attachment:
signature.asc
Description: PGP signature