On Wed, Sep 05, 2018 at 11:45:45AM +0530, Sreekanth Reddy wrote: > On Tue, Sep 4, 2018 at 3:12 PM, Lukas Wunner <lukas@xxxxxxxxx> wrote: > > Many scsi drivers call pci_channel_offline() to detect inaccessibility > > of the device due to a PCI error: > > https://elixir.bootlin.com/linux/v4.19-rc2/ident/pci_channel_offline > > > > A patch is pending such that surprise removal can also be queried > > with that same function: > > https://www.spinics.net/lists/linux-pci/msg75722.html > > Lukas, thanks for pointing to this pci_dev_is_disconnected() API. So > we can use this API directly instead of reading the vendor Id and > checking for all one's once this patch get accepted? Yes, except pci_dev_is_disconnected() is private to the PCI core, but dev->error_state and pci_channel_offline() is public. > I have one more instance where still we need this poll kthread, i.e > during the device probe time we have some commands which has more > timeout value (e.g. 300 seconds), so if user has unplugged this device > just after sending this more time-out valued command then driver has > to wait until this time-out value expires. i.e. this device is still > visible in lspci output until this 300 seconds timeout value expires > even though device is unplugged. if we have a poll kthread (which will > poll for every one second) then driver can early detect the unplugged > state and can terminate the outstanding commands and hence probe > operation can be completed quickly. The only instances I can see in your driver where it waits for 300 s is in _base_diag_reset(), which does an msleep(256) in a loop for up to 300 s, and scsih_scan_finished(), which is called in a loop with an msleep(10) by do_scsi_scan_host(). Any harm in simply checking for removal of the device in those loops and bailing out if so? Instead of the poll kthread to achieve the same? > Also whether we need to wait for below patches get accepted? so that > we can post the new set of patches accommodating according to your > suggestions, > https://www.spinics.net/lists/linux-pci/msg75722.html > https://www.spinics.net/lists/linux-pci/msg75438.html I can't tell you whether and when those patches get accepted, that's for Bjorn Helgaas to decide. Also, what does get accepted might differ. E.g. the first patch uses the existing pci_channel_io_perm_failure state for removal. There was debate whether to introduce a new state for removed devices to avoid overloading the existing state, which is used for error recovery. All I can recommend is to follow linux-pci, test the patches that have already been brought forward to ascertain they fulfil your needs, and generally participate in the debate so that your use cases are covered. Thanks, Lukas