On Fri, Jan 27, 2023 at 03:57:16PM +0530, Kumar, M Chetan wrote: > On 1/26/2023 8:55 PM, Bjorn Helgaas wrote: > > On Tue, Jan 24, 2023 at 08:45:43PM -0800, Jakub Kicinski wrote: > > > On Sat, 21 Jan 2023 19:03:23 +0530 m.chetan.kumar@xxxxxxxxxxxxxxx wrote: > > > > From: M Chetan Kumar <m.chetan.kumar@xxxxxxxxxxxxxxx> > > > > > > > > PCI rescan module implements "rescan work queue". > > > > In firmware flashing or coredump collection procedure > > > > WWAN device is programmed to boot in fastboot mode and > > > > a work item is scheduled for removal & detection. > > > > > > > > The WWAN device is reset using APCI call as part driver > > > > removal flow. Work queue rescans pci bus at fixed interval > > > > for device detection, later when device is detect work queue > > > > exits. > > > > I'm not sure what's going on here. Do we need to reset the device > > when the t7xx driver is loaded so the device will load new > > firmware when it comes out of reset? > > Flow is, Reset the device to get into firmware download mode then > update the firmware and later reset it to go back to normal mode. Thanks, that makes sense, and I'm confident that t7xx is not the only driver that needs to do something like this, so we should be able to figure out a nice way to do it. > > > > +void t7xx_pci_dev_rescan(void) > > > > +{ > > > > + struct pci_bus *b = NULL; > > > > + > > > > + pci_lock_rescan_remove(); > > > > + while ((b = pci_find_next_bus(b))) > > > > + pci_rescan_bus(b); > > > > No, this driver absolutely cannot rescan and assign unassigned > > resources for all the PCI buses in the system. > > T7xx device falls off the bus due to ACPI reset. > Would you please suggest how we can bring device back on the bus > without such changes inside driver ? Will pci_reset_function() help > in this regard ? "Falling off the bus" is not very precise language, but it usually means the device stops responding to PCI transactions like config, memory, or I/O accesses. Any kind of reset, whether it's done via ACPI _RST or one of the mechanisms used by pci_reset_function(), causes a PCI device to stop responding temporarily. When the device exits reset, it does some internal initialization and eventually becomes ready to respond to PCI transactions again. The PCI core doesn't do anything to the device to "bring it back on the bus" other than powering on the device or deasserting whatever signal initiated the reset in the first place. For example, if we do the reset via pci_reset_secondary_bus(), we set the Secondary Bus Reset (PCI_BRIDGE_CTL_BUS_RESET) bit in a bridge, which triggers a reset for devices below the bridge. When we clear Secondary Bus Reset, those devices reinitialize themselves and start responding to PCI transactions. pci_reset_secondary_bus() contains a ssleep(1) to give the device time for that initialization. The t7xx_pci_dev_rescan() loop calls pci_rescan_bus(), which does nothing to bring devices back on the bus. It merely issues config reads to all the possible device addresses to see which respond. If t7xx_pci_dev_rescan() seems to bring the device back on the bus, it is probably simply because it takes time and gives the device time to finish its initialization. It doesn't actually *do* anything to the device other than do a config read to it. I notice that t7xx_remove_rescan() would actually *remove* the pci_dev, and pci_rescan_bus() would create a new pci_dev if the t7xx device responds to a config read. But this a real mess. When you remove the device, the driver is detached from it, and we should no longer be running any driver code. If you can use pci_reset_function(), there's no need to remove and re-enumerate the device, so that should let you get rid of the whole t7xx_remove_rescan() workqueue. Bjorn