Re: Commit ef83b0781a73f (PCI: Remove from bus_list and release resources in pci_release_dev()) broke TBT hotplug

On Friday, January 31, 2014 02:49:21 PM Rafael J. Wysocki wrote:
> On Friday, January 31, 2014 02:36:07 PM Mika Westerberg wrote:
> > On Fri, Jan 31, 2014 at 12:52:43PM +0100, Rafael J. Wysocki wrote:
> > > So I think what happens is that we leak the struct pci_dev during removal and
> > > the proper cleanup is never done.
> > > 
> > > Can you please add a debug printk into pci_release_dev() and see if that's
> > > ever called after TBT unplug?
> > 
> > OK, I added the debug print (still on top of your two patches) and was able
> > to capture a bit more from /var/log/messages before it crashes. Here's the
> > log. I added dev_info(dev, "RELEASE\n") to pci_release_dev().
> > 
> > Unplug:
> > 
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.557920] pcieport 0000:06:03.0: PME# disabled
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.559483] pcieport 0000:05:00.0: PME# disabled
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.561074] pci 0000:07:00.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.562536] pci_bus 0000:07: busn_res: [bus 07] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.563993] pci 0000:06:03.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.570345] pci 0000:0a:00.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.571734] pci_bus 0000:0a: busn_res: [bus 0a] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.573154] pci 0000:09:00.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.574528] pci_bus 0000:09: busn_res: [bus 09-2e] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.575939] pci 0000:08:00.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.577316] pci_bus 0000:08: busn_res: [bus 08-2e] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.578721] pci 0000:06:04.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.580081] pci_bus 0000:2f: busn_res: [bus 2f] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.581487] pci 0000:06:05.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.582873] pci_bus 0000:06: busn_res: [bus 06-2f] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.584322] pci 0000:05:00.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.585727] pcieport 0000:03:00.0: PME# disabled
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.587225] pci_bus 0000:04: busn_res: [bus 04] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.588723] pci 0000:03:00.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.660389] pci_bus 0000:05: busn_res: [bus 05-2f] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.661993] pci 0000:03:03.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.663527] pci_bus 0000:30: busn_res: [bus 30-38] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.665103] pci 0000:03:04.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.666641] pci_bus 0000:39: busn_res: [bus 39] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.668210] pci 0000:03:05.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.669764] pci_bus 0000:3a: busn_res: [bus 3a] is released
> > Jan 31 20:05:57 buildroot kern.info kernel: [  439.671350] pci 0000:03:06.0: RELEASE
> > Jan 31 20:05:57 buildroot kern.debug kernel: [  439.672933] pci_bus 0000:03: busn_res: [bus 03-3a] is released
> 
> OK, so my guess wasn't right.  We seem to call pci_release_dev for all of the
> devices that go away after unplug.
> 
> Am I right in thinking that the below doesn't happen with Yinghai's commit
> reverted?
> 
> > Plug:
> > 
> > Jan 31 20:06:11 buildroot kern.debug kernel: [  453.609684] acpiphp_glue: hotplug_event: Bus check notify on \_SB_.PCI0.RP05
> > Jan 31 20:06:11 buildroot kern.debug kernel: [  453.611339] acpiphp_glue: hotplug_event: re-enumerating slots under \_SB_.PCI0.RP05
> > Jan 31 20:06:11 buildroot kern.debug kernel: [  453.614625] pci 0000:02:00.0: scanning [bus 03-3a] behind bridge, pass 0
> > Jan 31 20:06:11 buildroot kern.warn kernel: [  453.616434] ------------[ cut here ]------------
> > Jan 31 20:06:11 buildroot kern.warn kernel: [  453.618102] WARNING: CPU: 1 PID: 956 at lib/kobject.c:244 kobject_add_internal+0x12d/0x400()
> > Jan 31 20:06:11 buildroot kern.warn kernel: [  453.619797] kobject_add_internal failed for pci_bus (error: -2 parent: 0000:02:00.0)
> 
> create_dir() fails here and that's not because it already exists.
> Interesting.

That's more interesting than I thought.

So the error is -2, which is -ENOENT.  Let's see when create_dir() returns -ENOENT,
then.

Evidently, it calls sysfs_create_dir_ns() and, if that fails, returns its error
code.  If it returns 0, create_dir() returns the return value of populate_dir().

sysfs_create_dir_ns() tries to use kobj->parent->sd and returns -ENOENT when
that is NULL.  There you go.  So it looks like the sysfs dir of PCI device
0000:02:00.0 doesn't exist at this point.
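
To make the failure path concrete, here is a minimal user-space sketch of the
logic described above.  It is not the kernel source: the model_* names and the
two small structs are hypothetical stand-ins, and populate_dir() is assumed to
always succeed.  It only illustrates how a parent without a sysfs directory
(parent->sd == NULL) turns into -ENOENT in the caller.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a sysfs directory entry. */
struct model_dirent {
	int unused;
};

/* Hypothetical stand-in for struct kobject; only the fields used here. */
struct model_kobject {
	struct model_kobject *parent;
	struct model_dirent *sd;	/* NULL when no sysfs dir exists */
};

struct model_dirent model_sysfs_root;	/* used when there is no parent */
struct model_dirent model_new_dir;	/* dir "created" on success */

/* Models the check attributed to sysfs_create_dir_ns() in the text. */
int model_sysfs_create_dir(struct model_kobject *kobj)
{
	struct model_dirent *parent_sd;

	if (kobj->parent)
		parent_sd = kobj->parent->sd;
	else
		parent_sd = &model_sysfs_root;

	/* Parent has no sysfs directory: nothing to create the new dir in. */
	if (!parent_sd)
		return -ENOENT;

	kobj->sd = &model_new_dir;
	return 0;
}

/* Assumption: populating default attributes always succeeds in this model. */
int model_populate_dir(struct model_kobject *kobj)
{
	(void)kobj;
	return 0;
}

/* Models create_dir(): pass the error through, else populate the dir. */
int model_create_dir(struct model_kobject *kobj)
{
	int error = model_sysfs_create_dir(kobj);

	if (!error)
		error = model_populate_dir(kobj);
	return error;
}
```

With a parent whose sd is NULL (playing the role of 0000:02:00.0),
model_create_dir() on the child returns -ENOENT, and it returns 0 once the
parent's sd is set.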

Yinghai, any ideas?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.