> From: Long Li <longli@xxxxxxxxxxxxx>
> Sent: Tuesday, March 28, 2023 9:49 AM
>
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -3308,6 +3308,19 @@ static int hv_pci_query_relations(struct hv_device *hdev)
> >  	if (!ret)
> >  		ret = wait_for_response(hdev, &comp);
> >
> > +	/*
> > +	 * In the case of fast device addition/removal, it's possible that
> > +	 * vmbus_sendpacket() or wait_for_response() returns -ENODEV but we
> > +	 * already got a PCI_BUS_RELATIONS* message from the host and the
> > +	 * channel callback already scheduled a work to hbus->wq, which can be
> > +	 * running survey_child_resources() -> complete(&hbus->survey_event),
> > +	 * even after hv_pci_query_relations() exits and the stack variable
> > +	 * 'comp' is no longer valid. This can cause a strange hang issue
> > +	 * or sometimes a page fault. Flush hbus->wq before we exit from
> > +	 * hv_pci_query_relations() to avoid the issues.
> > +	 */
> > +	flush_workqueue(hbus->wq);
>
> Is it possible for PCI_BUS_RELATIONS to be scheduled arrive after calling
> flush_workqueue(hbus->wq)?

It's possible, but that doesn't matter: hv_pci_query_relations() is
called only once, and it sets hbus->survey_event to point to the stack
variable 'comp'. The first call to survey_child_resources() calls
complete() on 'comp' and sets hbus->survey_event to NULL. When
survey_child_resources() is called a second time, hbus->survey_event
is NULL, so it returns immediately.

According to my test, after hv_pci_enter_d0() posts PCI_BUS_D0ENTRY,
the guest receives a second PCI_BUS_RELATIONS2 message, which is
identical to the first one. Handling it is basically a no-op in
pci_devices_present_work(), especially with the newly-introduced
per-hbus state_lock mutex.
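
To illustrate why a late PCI_BUS_RELATIONS message is harmless, here is
a simplified, single-threaded userspace model of the guard described
above. It is only a sketch, not the driver code: the model_* types and
functions are hypothetical stand-ins for struct completion, the hbus
structure and survey_child_resources(), and it ignores locking and the
workqueue entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct completion. */
struct model_completion {
	int done;	/* number of times the waiter was signalled */
};

/* Hypothetical stand-in for the hbus; only the relevant field is modelled. */
struct model_hbus {
	struct model_completion *survey_event;	/* NULL when nobody is waiting */
};

/* Models the guard: signal the waiter once, then clear the pointer. */
static void model_survey_child_resources(struct model_hbus *hbus)
{
	struct model_completion *event = hbus->survey_event;

	hbus->survey_event = NULL;
	if (!event)
		return;		/* nobody is waiting: a later message is a no-op */

	event->done++;		/* plays the role of complete() */
}

/* Returns how often the waiter is signalled when two messages arrive. */
static int model_two_relations_messages(void)
{
	struct model_completion comp = { 0 };	/* plays the stack variable 'comp' */
	struct model_hbus hbus = { &comp };

	model_survey_child_resources(&hbus);	/* first message completes 'comp' */
	model_survey_child_resources(&hbus);	/* second message returns immediately */

	return comp.done;
}
```

In this model the second call never touches 'comp', because the pointer
was already cleared by the first call. That is the property the
explanation above relies on once 'comp' has gone out of scope.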