This is a note to let you know that I've just added the patch titled PCI: hv: Fix a race condition bug in hv_pci_query_relations() to the 5.10-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: pci-hv-fix-a-race-condition-bug-in-hv_pci_query_relations.patch and it can be found in the queue-5.10 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 440b5e3663271b0ffbd4908115044a6a51fb938b Mon Sep 17 00:00:00 2001 From: Dexuan Cui <decui@xxxxxxxxxxxxx> Date: Wed, 14 Jun 2023 21:44:47 -0700 Subject: PCI: hv: Fix a race condition bug in hv_pci_query_relations() From: Dexuan Cui <decui@xxxxxxxxxxxxx> commit 440b5e3663271b0ffbd4908115044a6a51fb938b upstream. Since day 1 of the driver, there has been a race between hv_pci_query_relations() and survey_child_resources(): during fast device hotplug, hv_pci_query_relations() may error out due to device-remove and the stack variable 'comp' is no longer valid; however, pci_devices_present_work() -> survey_child_resources() -> complete() may be running on another CPU and accessing the no-longer-valid 'comp'. Fix the race by flushing the workqueue before we exit from hv_pci_query_relations(). Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx> Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx> Acked-by: Lorenzo Pieralisi <lpieralisi@xxxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx Link: https://lore.kernel.org/r/20230615044451.5580-2-decui@xxxxxxxxxxxxx Signed-off-by: Wei Liu <wei.liu@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/pci/controller/pci-hyperv.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -2912,6 +2912,24 @@ static int hv_pci_query_relations(struct if (!ret) ret = wait_for_response(hdev, &comp); + /* + * In the case of fast device addition/removal, it's possible that + * vmbus_sendpacket() or wait_for_response() returns -ENODEV but we + * already got a PCI_BUS_RELATIONS* message from the host and the + * channel callback already scheduled a work to hbus->wq, which can be + * running pci_devices_present_work() -> survey_child_resources() -> + * complete(&hbus->survey_event), even after hv_pci_query_relations() + * exits and the stack variable 'comp' is no longer valid; as a result, + * a hang or a page fault may happen when the complete() calls + * raw_spin_lock_irqsave(). Flush hbus->wq before we exit from + * hv_pci_query_relations() to avoid the issues. Note: if 'ret' is + * -ENODEV, there can't be any more work item scheduled to hbus->wq + * after the flush_workqueue(): see vmbus_onoffer_rescind() -> + * vmbus_reset_channel_cb(), vmbus_rescind_cleanup() -> + * channel->rescind = true. + */ + flush_workqueue(hbus->wq); + return ret; } Patches currently in stable-queue which might be from decui@xxxxxxxxxxxxx are queue-5.10/pci-hv-fix-a-race-condition-in-hv_irq_unmask-that-can-cause-panic.patch queue-5.10/revert-pci-hv-fix-a-timing-issue-which-causes-kdump-to-fail-occasionally.patch queue-5.10/pci-hv-remove-the-useless-hv_pcichild_state-from-struct-hv_pci_dev.patch queue-5.10/pci-hv-fix-a-race-condition-bug-in-hv_pci_query_relations.patch