RE: [PATCH v12 5/5] iommu/vt-d: improve ITE fault handling if target device isn't present

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Ethan Zhao <haifeng.zhao@xxxxxxxxxxxxxxx>
> Sent: Tuesday, January 30, 2024 5:13 PM
> 
> On 1/30/2024 4:43 PM, Tian, Kevin wrote:
> >> From: Ethan Zhao <haifeng.zhao@xxxxxxxxxxxxxxx>
> >> Sent: Tuesday, January 30, 2024 4:16 PM
> >>
> >> On 1/30/2024 2:22 PM, Tian, Kevin wrote:
> >>> Here we need consider two situations.
> >>>
> >>> One is that the device is not bound to a driver or bound to a driver
> >>> which doesn't do active work to the device when it's removed. In
> >>> that case one may observe the timeout situation only in the removal
> >>> path as the stack dump in your patch02 shows.
> >> When iommu_bus_notifier() got called for hotplug removal cases to
> >> flush devTLB (ATS invalidation), driver was already unloaded.
> >> whatever safe removal or surprise removal. so in theory no active
> >> driver working there.
> >>
> >> pciehp_ist()
> >>    pciehp_disable_slot()
> >>     remove_board()
> >>      pciehp_unconfigure_device()
> >>       pci_stop_and_remove_bus_device()
> >>        pci_stop_bus_device()--->here unload driver
> >>        pci_remove_bus_device()->here qi_flush_dev_iotlb() got called.
> > yes, so patch02 can fix this case.
> >
> >>> patch02 can fix that case by checking whether the device is present
> >>> to skip sending the invalidation requests. So the logic being discussed
> >>> here doesn't matter.
> >>>
> >>> The 2nd situation is more tricky. The device might be bound to
> >>> a driver which is doing active work to the device with in-fly
> >>> ATS invalidation requests. In this case in-fly requests must be aborted
> >>> before the driver can be detached from the removed device.
> Conceptually
> >>> a device is removed from the bus only after its driver is detached.
> >> Some tricky situations:
> >>
> >> 1. The ATS invalidation request is issued from driver driver, while it is
> >> in handling, device is removed. this momment, the device instance still
> >> exists in the bus list. yes, if searching it by BDF, could get it.
> > it's searchable between the point where the device is removed and the
> > point where the driver is unloaded:
> >
> >          CPU0                                CPU1
> >    (Driver is active)                    (pciehp handler)
> >    qi_submit_sync()                      pciehp_ist()
> >      ...                                   ...
> >      loop for completion() {               pciehp_unconfigure_device()
> >        ...                                   pci_dev_set_disconnected()
> >        if (ITE) {                            ...
> >          //find pci_dev from sid             pci_remove_bus_device()
> >          if (pci_dev_is_connected())           device_del()
> >            break;                                bus_remove_device()
> >        }                                           device_remove_driver()
> 
> If the device was hot plugin or re-scanned, the device has a PCI_DEV_ADDED
> flag,

in this case is pci_dev_is_disconnected() true or false? 

how is this patch supposed to work with it?

> if so the driver unloading work isn't defered to the tail of device_del(), it
> is unloaded before pci_remove_bus_device()->device_del(), in pci_stop_dev
> 
> pci_stop_bus_device()
>   pci_stop_dev()
>   {
>    if (pci_dev_is_added(dev)) {
>        device_release_driver(&dev->dev);
>   }

no matter where driver unload is requested, it needs to wait for aborting
in-fly request on CPU0.

> 
> So the interval the device is searchable, only applied to those devices
> not hot plugged, or never be scanned.
> 

and in the worst case even if pci_dev is not searchable, isn't it already
an indicator that the device is absent then qi_submit_sync() should
just exit upon ITE?




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux