Comment # 21
on bug 108521
from Robert Strube
Hi guys,

Apologies for the deluge of posts here, I've been trying really hard to investigate this issue!

So I took a closer look at the PCI resource issues that you mentioned. I've also been looking at Thunderbolt driver issues in general, and I've noticed that this type of log message is quite common.

Here's what I'm wondering:

These four devices correspond to the TB to PCI bridges in the system:

0000:04:00.0
0000:05:01.0
0000:05:02.0
0000:05:04.0

04:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Bus: primary=04, secondary=05, subordinate=6e, sec-latency=0
        Memory behind bridge: bc000000-ea0fffff
        Prefetchable memory behind bridge: 0000002fb0000000-0000002ff9ffffff
        Capabilities: [80] Power Management version 3
        Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
        Capabilities: [c0] Express Upstream Port, MSI 00
        Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] Virtual Channel
        Capabilities: [400] Power Budgeting <?>
        Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8 <?>
        Capabilities: [600] Latency Tolerance Reporting
        Capabilities: [700] #19
        Kernel driver in use: pcieport

05:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Bus: primary=05, secondary=06, subordinate=06, sec-latency=0
        Memory behind bridge: ea000000-ea0fffff
        Capabilities: [80] Power Management version 3
        Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
        Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
        Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] Virtual Channel
        Capabilities: [400] Power Budgeting <?>
        Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8 <?>
        Capabilities: [700] #19
        Kernel driver in use: pcieport

05:01.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Bus: primary=05, secondary=07, subordinate=39, sec-latency=0
        Memory behind bridge: bc000000-d3efffff
        Prefetchable memory behind bridge: 0000002fb0000000-0000002fcfffffff
        Capabilities: [80] Power Management version 3
        Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
        Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
        Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] Virtual Channel
        Capabilities: [400] Power Budgeting <?>
        Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8 <?>
        Capabilities: [700] #19
        Kernel driver in use: pcieport

05:02.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 18
        Bus: primary=05, secondary=3a, subordinate=3a, sec-latency=0
        Memory behind bridge: d3f00000-d3ffffff
        Capabilities: [80] Power Management version 3
        Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
        Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
        Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] Virtual Channel
        Capabilities: [400] Power Budgeting <?>
        Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8 <?>
        Capabilities: [700] #19
        Kernel driver in use: pcieport

05:04.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Bus: primary=05, secondary=3b, subordinate=6e, sec-latency=0
        Memory behind bridge: d4000000-e9ffffff
        Prefetchable memory behind bridge: 0000002fd0000000-0000002ff9ffffff
        Capabilities: [80] Power Management version 3
        Capabilities: [88] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [ac] Subsystem: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
        Capabilities: [c0] Express Downstream Port (Slot+), MSI 00
        Capabilities: [100] Device Serial Number b7-de-04-b0-a6-c9-a0-00
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] Virtual Channel
        Capabilities: [400] Power Budgeting <?>
        Capabilities: [500] Vendor Specific Information: ID=1234 Rev=1 Len=0d8 <?>
        Capabilities: [700] #19
        Kernel driver in use: pcieport

First you see pci defining the bridge windows for the devices:

[ 104.290143] pci 0000:05:01.0: bridge window [io 0x1000-0x0fff] to [bus 07-39] add_size 1000
[ 104.290152] pci 0000:05:02.0: bridge window [io 0x1000-0x0fff] to [bus 3a] add_size 1000
[ 104.290155] pci 0000:05:02.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 3a] add_size 200000 add_align 100000
[ 104.290169] pci 0000:05:04.0: bridge window [io 0x1000-0x0fff] to [bus 3b-6e] add_size 1000
[ 104.290180] pci 0000:04:00.0: bridge window [io 0x1000-0x0fff] to [bus 05-6e] add_size 3000

Then you see a bunch of BAR errors, saying there's no space and that they can't be assigned:

[ 104.290184] pci 0000:04:00.0: BAR 13: no space for [io size 0x3000]
[ 104.290185] pci 0000:04:00.0: BAR 13: failed to assign [io size 0x3000]
[ 104.290187] pci 0000:04:00.0: BAR 13: no space for [io size 0x3000]
[ 104.290188] pci 0000:04:00.0: BAR 13: failed to assign [io size 0x3000]
[ 104.290193] pci 0000:05:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[ 104.290194] pci 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[ 104.290196] pci 0000:05:01.0: BAR 13: no space for [io size 0x1000]
[ 104.290197] pci 0000:05:01.0: BAR 13: failed to assign [io size 0x1000]
[ 104.290198] pci 0000:05:02.0: BAR 13: no space for [io size 0x1000]
[ 104.290199] pci 0000:05:02.0: BAR 13: failed to assign [io size 0x1000]
[ 104.290201] pci 0000:05:04.0: BAR 13: no space for [io size 0x1000]
[ 104.290202] pci 0000:05:04.0: BAR 13: failed to assign [io size 0x1000]
[ 104.290203] pci 0000:05:04.0: BAR 13: no space for [io size 0x1000]
[ 104.290205] pci 0000:05:04.0: BAR 13: failed to assign [io size 0x1000]
[ 104.290207] pci 0000:05:02.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
[ 104.290208] pci 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
[ 104.290209] pci 0000:05:02.0: BAR 13: no space for [io size 0x1000]
[ 104.290210] pci 0000:05:02.0: BAR 13: failed to assign [io size 0x1000]
[ 104.290212] pci 0000:05:01.0: BAR 13: no space for [io size 0x1000]
[ 104.290213] pci 0000:05:01.0: BAR 13: failed to assign [io size 0x1000]

But then you see that the PCI bridges seem to initialize for all the devices:

[ 104.290215] pci 0000:05:00.0: PCI bridge to [bus 06]
[ 104.290221] pci 0000:05:00.0: bridge window [mem 0xea000000-0xea0fffff]
[ 104.290231] pci 0000:05:01.0: PCI bridge to [bus 07-39]
[ 104.290237] pci 0000:05:01.0: bridge window [mem 0xbc000000-0xd3efffff]
[ 104.290241] pci 0000:05:01.0: bridge window [mem 0x2fb0000000-0x2fcfffffff 64bit pref]
[ 104.290248] pci 0000:05:02.0: PCI bridge to [bus 3a]
[ 104.290254] pci 0000:05:02.0: bridge window [mem 0xd3f00000-0xd3ffffff]
[ 104.290264] pci 0000:05:04.0: PCI bridge to [bus 3b-6e]
[ 104.290270] pci 0000:05:04.0: bridge window [mem 0xd4000000-0xe9ffffff]
[ 104.290274] pci 0000:05:04.0: bridge window [mem 0x2fd0000000-0x2ff9ffffff 64bit pref]
[ 104.290281] pci 0000:04:00.0: PCI bridge to [bus 05-6e]
[ 104.290286] pci 0000:04:00.0: bridge window [mem 0xbc000000-0xea0fffff]
[ 104.290291] pci 0000:04:00.0: bridge window [mem 0x2fb0000000-0x2ff9ffffff 64bit pref]

Perhaps the BAR errors are just a red herring and at the end of the process all of the Thunderbolt PCI bridges *are* initialized correctly?

As I said, I've probably spent way too much time looking at this. The main thing I keep coming back to is that my other GPU *does* work correctly as an eGPU. It's also a PCI x16 card (I know it's operating over PCI x4 due to TB3 bandwidth limitations), so theoretically if there were any PCI resource problems with the Thunderbolt bridge then this GPU should also fail, correct?

I noticed a couple of other things in my research:

I found a bug that points to tlp (specifically power management) as causing the same problems with the atom bios being stuck in a loop:

https://bugs.freedesktop.org/show_bug.cgi?id=103783

Perhaps the issue is caused by some sort of aggressive PM? I might try adding some kernel boot parameters: amdgpu.dpm=0 amdgpu.apm=0 etc.

I was also thinking that perhaps I should try the AMDGPU-PRO drivers just to see if they would work by chance. Somebody else reported that these drivers worked, while the amdgpu drivers failed. It's worth a shot.

Thanks for any feedback and/or advice!

Rob
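
For anyone wanting to check the "red herring" idea against their own log, here is a minimal sketch that tallies the distinct "failed to assign" entries per device and resource type. It assumes the kernel log has been saved as plain text; the function name, the regular expression, and the short sample (copied from the messages quoted above) are illustrative only and not part of any kernel tooling.

```python
import re

# Matches kernel lines such as:
#   pci 0000:04:00.0: BAR 13: failed to assign [io size 0x3000]
# \s+ tolerates the variable spacing dmesg uses inside the brackets.
FAILED = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): failed to assign "
    r"\[(?P<kind>io|mem)\s+size (?P<size>0x[0-9a-f]+)"
)


def summarize_failures(log_text):
    """Return {(device, bar, kind): size} for each distinct failed assignment."""
    failures = {}
    for m in FAILED.finditer(log_text):
        key = (m["dev"], int(m["bar"]), m["kind"])
        failures[key] = int(m["size"], 16)  # repeated lines collapse to one entry
    return failures


# A few lines taken from the log quoted in this comment.
SAMPLE_LOG = """\
[ 104.290184] pci 0000:04:00.0: BAR 13: no space for [io size 0x3000]
[ 104.290185] pci 0000:04:00.0: BAR 13: failed to assign [io size 0x3000]
[ 104.290188] pci 0000:04:00.0: BAR 13: failed to assign [io size 0x3000]
[ 104.290194] pci 0000:05:02.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
"""

result = summarize_failures(SAMPLE_LOG)
for (dev, bar, kind), size in sorted(result.items()):
    print(f"{dev} BAR {bar}: {kind} window of {size:#x} bytes not assigned")
```

Run over the full log above, this kind of tally would make it easier to see that the failures are limited to I/O windows plus one prefetchable memory window on 05:02.0, while the non-prefetchable memory windows all appear in the later "bridge window" lines.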