On Sat, Feb 23, 2013 at 4:20 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: > I think this is a general object lifetime issue that really has > nothing to do with ASPM except that ASPM happens to be the victim. > > You're doing this: > > echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove ; echo -n 1 >> /sys/bus/pci/devices/0000\:1a\:01.0/remove > > The 1a:01.0 device is downstream from the 10:00.0 bridge. The sysfs > interface remove_store() uses device_schedule_callback() to schedule > the remove for later. I think what's happening is that we schedule > remove_callback() for both devices before 10:00.0 has been removed, > like this: > > # echo -n 1 > /sys/bus/pci/devices/0000\:10\:00.0/remove > remove_store # for 10:00.0 > device_schedule_callback(10:00.0, remove_callback) > sysfs_schedule_callback > kobject_get > queue_work > # echo -n 1 > /sys/bus/pci/devices/0000\:1a\:01.0/remove > remove_store # for 1a:01.0 > device_schedule_callback(1a:01.0, remove_callback) > sysfs_schedule_callback > kobject_get > queue_work > > Note that we acquire a reference on each pci_dev before queuing the work item. > > Later, we run the callbacks, starting with 10:00.0. This calls > remove_callback() to perform the remove: > > remove_callback(10:00.0) > mutex_lock(&pci_remove_rescan_mutex) > pci_stop_and_remove_bus_device(pdev) > mutex_unlock(&pci_remove_rescan_mutex) > > This will stop and remove the subtree below 10:00.0, but it does not > actually free the pci_dev for 1a:01.0 because we increased its ref > count in sysfs_schedule_callback. So after completing > remove_callback(10:00.0), we run the second callback for 1a:01.0. > > The remove for 1a:01.0 calls pcie_aspm_exit_link_state() from > pci_stop_dev(). This is where we blow up because, according to your > debugging, pdev->bus->self is no longer valid. > > The PCI core did this removal wrong. If we have a valid pci_dev > pointer, as we do in pcie_aspm_exit_link_state(), the whole object > ought to be valid. But the PCI core deallocated the struct pci_bus > for bus 0000:1a too soon. > > My guess is that when we build a pci_dev, we need to increase the ref > count on the pci_bus where that pci_dev lives. That way we can keep > around all the buses and bridges leading from the root to the device > in question. Agreed, Please check if attached is what you want. Thanks Yinghai
Attachment:
pci_dev_bus_ref.patch
Description: Binary data