On Wed, Aug 25, 2021 at 2:11 PM Michael Kelley <mikelley@xxxxxxxxxxxxx> wrote:
>
> From: Long Li <longli@xxxxxxxxxxxxx> Sent: Tuesday, August 24, 2021 10:28 AM
> >
> > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus
> > >
> > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@xxxxxxxxxxxxxxxxx wrote:
> > > > From: Long Li <longli@xxxxxxxxxxxxx>
> > > >
> > > > In hv_pci_bus_exit, the code is holding a spinlock while calling
> > > > pci_destroy_slot(), which takes a mutex.
> > > >
> > > > This is not safe for spinlock. Fix this by moving the children to be
> > > > deleted to a list on the stack, and removing them after spinlock is
> > > > released.
> > > >
> > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the
> > > > device")
> > > >
> > > > Cc: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>
> > > > Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> > > > Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
> > > > Cc: Wei Liu <wei.liu@xxxxxxxxxx>
> > > > Cc: Dexuan Cui <decui@xxxxxxxxxxxxx>
> > > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> > > > Cc: Rob Herring <robh@xxxxxxxxxx>
> > > > Cc: "Krzysztof Wilczyński" <kw@xxxxxxxxx>
> > > > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > > > Cc: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> > > > Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > > > Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > > > Signed-off-by: Long Li <longli@xxxxxxxxxxxxx>
> > > > ---
> > > >  drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++---
> > > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > > b/drivers/pci/controller/pci-hyperv.c
> > > > index a53bd8728d0d..d4f3cce18957 100644
> > > > --- a/drivers/pci/controller/pci-hyperv.c
> > > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev,
> > > bool keep_devs)
> > > >  	struct hv_pci_dev *hpdev, *tmp;
> > > >  	unsigned long flags;
> > > >  	int ret;
> > > > +	struct list_head removed;
> > >
> > > This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its
> > > scope.
> > >
> > > >
> > > >  	/*
> > > >  	 * After the host sends the RESCIND_CHANNEL message, it doesn't @@
> > > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool
> > > keep_devs)
> > > >  		return 0;
> > > >
> > > >  	if (!keep_devs) {
> > > > -		/* Delete any children which might still exist. */
> > > > +		INIT_LIST_HEAD(&removed);
> > > > +
> > > > +		/* Move all present children to the list on stack */
> > > >  		spin_lock_irqsave(&hbus->device_list_lock, flags);
> > > > -		list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > > list_entry) {
> > > > +		list_for_each_entry_safe(hpdev, tmp, &hbus->children,
> > > list_entry)
> > > > +			list_move_tail(&hpdev->list_entry, &removed);
> > > > +		spin_unlock_irqrestore(&hbus->device_list_lock, flags);
> > > > +
> > > > +		/* Remove all children in the list */
> > > > +		while (!list_empty(&removed)) {
> > > > +			hpdev = list_first_entry(&removed, struct hv_pci_dev,
> > > > +						 list_entry);
> > >
> > > list_for_each_entry_safe can also be used here, right?
> > >
> > > Wei.
> >
> > I will address your comments.
> >
> > Long
>
> I thought list_for_each_entry_safe() is for use when list manipulation
> is *not* protected by a lock and you want to safely walk the list
> even if an entry gets removed. If the list is protected by a lock or
> not subject to contention (as is the case here), then
> list_for_each_entry() is the simpler implementation. The original
> implementation didn't need to use the _safe version because of
> the spin lock.
>
> Or do I have it backwards?

"_safe" only means "safe against removal of list entry" as the
kerneldoc says. But that means removal within the loop iteration, not
any writer. A lock is needed in either case if there's another writer.

Don't ask me about the RCU variant though...

Rob
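
To make the "_safe" point concrete, here is a minimal userspace sketch of the
two-phase pattern from the patch. It is not kernel code: the list helpers are
simplified hand-written stand-ins for <linux/list.h> (no pointer poisoning, no
typeof-based macros), and struct child and demo_remove_children() are
hypothetical names used only for illustration. The first loop is a
hand-expanded form of what list_for_each_entry_safe() does: it saves the next
pointer before the body runs, so the body may unlink the current entry. That
protects only against removal inside the loop body, not against a concurrent
writer, which is why the kernel code still needs the lock around phase 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel's doubly linked list (assumption). */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;	/* the kernel poisons these instead */
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical child device, standing in for struct hv_pci_dev. */
struct child { int id; struct list_head list_entry; };

/* Builds n children, removes them in two phases, returns how many were freed. */
int demo_remove_children(int n)
{
	struct list_head children, removed, *pos, *next;
	int i, freed = 0;

	init_list_head(&children);
	init_list_head(&removed);
	for (i = 0; i < n; i++) {
		struct child *c = malloc(sizeof(*c));
		if (!c)
			break;
		c->id = i;
		list_add_tail(&c->list_entry, &children);
	}

	/*
	 * Phase 1: in the kernel this loop would run with the spinlock held.
	 * 'next' is saved before the body runs, so unlinking 'pos' is fine.
	 */
	for (pos = children.next, next = pos->next; pos != &children;
	     pos = next, next = pos->next) {
		list_del(pos);
		list_add_tail(pos, &removed);	/* together: list_move_tail() */
	}
	assert(list_empty(&children));

	/* Phase 2: the private list has no other writer, so no lock is needed. */
	while (!list_empty(&removed)) {
		struct child *c = container_of(removed.next, struct child,
					       list_entry);
		list_del(&c->list_entry);
		free(c);	/* a sleeping teardown call would be legal here */
		freed++;
	}
	return freed;
}
```

Calling demo_remove_children(3) builds three entries, moves them to the
private list, and frees them. Phase 2 could equally be written with the
saved-next loop from phase 1, which is what Wei's review comment suggests.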