Re: SSD surprise removal leads to long wait inside pci_dev_wait() and FLR 65s timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[+cc Alex, Lukas]

On Fri, May 31, 2019 at 09:55:20AM -0700, JD Zheng wrote:
> Hello,
> 
> I am running DPDK 18.11+SPDK 19.04 with v5.1 kernel. DPDK/SPDK uses SSD vfio
> devices and after running SPDK's nvmf_tgt, unplugging a SSD cause kernel to
> print out following:
> [  105.426952] vfio-pci 0000:04:00.0: not ready 2047ms after FLR; waiting
> [  107.698953] vfio-pci 0000:04:00.0: not ready 4095ms after FLR; waiting
> [  112.050960] vfio-pci 0000:04:00.0: not ready 8191ms after FLR; waiting
> [  120.498953] vfio-pci 0000:04:00.0: not ready 16383ms after FLR; waiting
> [  138.418957] vfio-pci 0000:04:00.0: not ready 32767ms after FLR; waiting
> [  173.234953] vfio-pci 0000:04:00.0: not ready 65535ms after FLR; giving up
> 
> Looks like it is a PCI hotplug racing condition between DPDK's
> eal-intr-thread thread and kernel's pciehp thread. And it causes lockup in
> pci_dev_wait() at kernel side.
> 
> When SSD is removed, eal-intr-thread immediately receives
> RTE_INTR_HANDLE_ALARM and handler calls rte_pci_detach_dev() and at kernel
> side vfio_pci_release() is triggered to release this vfio device, which
> calls pci_try_reset_function(), then _pci_reset_function_locked().
> pci_try_reset_function acquires the device lock but
> _pci_reset_function_locked() doesn't return, therefore lock is NOT released.
> 
> Inside _pci_reset_function_locked(), pcie_has_flr(), pci_pm_reset(), etc.
> call pci_dev_wait() at the end but it doesn't return and print out above
> message until 65s timeout.
> 
> At kernel pciehp side, it also detects the removal but doesn't run
> immediately as it is configured as "pciehp.pciehp_poll_time=5". So a couple
> of seconds later, it calls pciehp_unconfigure_device -> pci_walk_bus ->
> pci_dev_set_disconnected. pci_dev_set_disconnected() couldn't get the device
> lock and is stuck too because the lock is hold by eal-intr-thread.
> 
> The first issue is in pci_dev_wait(). It calls pci_read_config_dword() and
> only when id is not all ones, it can return. But when SSD is physically
> removed, id retrieved is always all ones therefore, it has to wait for FLR
> 65s timeout to return.
> 
> I did the following to check return value of pci_read_config_dword() to fix
> this:
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4439,7 +4439,11 @@ static int pci_dev_wait(struct pci_dev *dev, char
> *reset_type, int timeout)
> 
>                 msleep(delay);
>                 delay *= 2;
> -               pci_read_config_dword(dev, PCI_COMMAND, &id);
> +               if (pci_read_config_dword(dev, PCI_COMMAND, &id) ==
> +                   PCIBIOS_DEVICE_NOT_FOUND) {
> +                       pci_info(dev, "device disconnected\n");
> +                       return -ENODEV;
> +               }
>         }
> 
>         if (delay > 1000)
> 
> The second issue is that due to lock up described above, the
> pci_dev_set_disconnected() is stuck and pci_read_config_dword() won't return
> PCIBIOS_DEVICE_NOT_FOUND.
> 
> I didn't find a easy way to fix it. Maybe use device lock in
> pci_dev_set_disconnected() is too coarse and we need a finer device
> err_state lock?
> 
> BTW, pci_dev_set_disconnected wasn't using device lock until this change
> a6bd101b8f.
> 
> Any suggestions to fix this problem?

Would you mind opening a report at https://bugzilla.kernel.org and
attaching the complete dmesg log and "lspci -vv" output?

Out of curiosity, why do you use "pciehp.pciehp_poll_time=5"?

Bjorn



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux