From: Long Li <longli@xxxxxxxxxxxxx> On removing the device, any work item (hv_pci_devices_present() or hv_pci_eject_device()) scheduled on workqueue hbus->wq may still be running and race with hv_pci_remove(). This can happen because the host may send PCI_EJECT or PCI_BUS_RELATIONS(2) and decide to rescind the channel immediately after that. Fix this by flushing/stopping the workqueue of hbus before doing hbus remove. Signed-off-by: Long Li <longli@xxxxxxxxxxxxx> --- Change in v2: Remove unused bus state hv_pcibus_removed drivers/pci/controller/pci-hyperv.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 27a17a1e4a7c..fc948a2ed703 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -444,7 +444,6 @@ enum hv_pcibus_state { hv_pcibus_probed, hv_pcibus_installed, hv_pcibus_removing, - hv_pcibus_removed, hv_pcibus_maximum }; @@ -3305,13 +3304,22 @@ static int hv_pci_remove(struct hv_device *hdev) hbus = hv_get_drvdata(hdev); if (hbus->state == hv_pcibus_installed) { + tasklet_disable(&hdev->channel->callback_event); + hbus->state = hv_pcibus_removing; + tasklet_enable(&hdev->channel->callback_event); + destroy_workqueue(hbus->wq); + /* + * At this point, no work is running or can be scheduled + * on hbus-wq. We can't race with hv_pci_devices_present() + * or hv_pci_eject_device(), it's safe to proceed. + */ + /* Remove the bus from PCI's point of view. */ pci_lock_rescan_remove(); pci_stop_root_bus(hbus->pci_bus); hv_pci_remove_slots(hbus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - hbus->state = hv_pcibus_removed; } ret = hv_pci_bus_exit(hdev, false); @@ -3326,7 +3334,6 @@ static int hv_pci_remove(struct hv_device *hdev) irq_domain_free_fwnode(hbus->sysdata.fwnode); put_hvpcibus(hbus); wait_for_completion(&hbus->remove_event); - destroy_workqueue(hbus->wq); hv_put_dom_num(hbus->sysdata.domain); -- 2.27.0