[Bug 218795] USB4 / Thunderbolt + AMD: unstable and slow link (many uncorrectable errors)

https://bugzilla.kernel.org/show_bug.cgi?id=218795

--- Comment #2 from Eduard Kachur (glite60@xxxxxxxxx) ---
Created attachment 307145
  --> https://bugzilla.kernel.org/attachment.cgi?id=307145&action=edit
trace log with thunderbolt and pci

I have a similar case with an eGPU and VFIO passthrough into a Windows VM,
which crashes.

Laptop specs:
HP ZBook Firefly G10 A
Ryzen 7 7840HS
Wikingoo Q1L box with JHL6340. I also bought and tried a Wikingoo P1-60W-M,
which the manufacturer says has a JHL7440, but lspci reports it as JHL7540.
Nvidia Quadro P1000
Ubuntu 24.10, kernel 6.11

The system logs many errors like these:
[ 6323.581954] pcieport 0000:00:04.1: AER: Correctable error message received from 0000:64:01.0
[ 6323.581966] pcieport 0000:64:01.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
[ 6323.581969] pcieport 0000:64:01.0:   device [8086:15da] error status/mask=00000080/00002000
[ 6323.581973] pcieport 0000:64:01.0:    [ 7] BadDLLP

Eventually the VM crashes with:
[ 6360.466620] pcieport 0000:00:04.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:65:00.0
[ 6360.466648] vfio-pci 0000:65:00.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 6360.466652] vfio-pci 0000:65:00.0:   device [10de:1cb1] error status/mask=00004000/00000000
[ 6360.466655] vfio-pci 0000:65:00.0:    [14] CmpltTO                (First)
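
For reference, the status/mask values in these messages are bit fields, and the
bracketed number on the last line of each message is the bit that is set. A
minimal Python sketch of that decoding, assuming the short names used by the
kernel's AER driver (only a subset of bits is listed, and the `decode` helper
is a hypothetical name for illustration):

```python
# Subset of AER status bit names, as printed by the kernel's AER driver.
# Unlisted bits are reported as "bitN" rather than guessed.
CORRECTABLE = {
    0: "RxErr", 6: "BadTLP", 7: "BadDLLP", 8: "Rollover",
    12: "Timeout", 13: "NonFatalErr",
}
UNCORRECTABLE = {
    4: "DLP", 12: "TLP", 13: "FCP", 14: "CmpltTO", 15: "CmpltAbrt",
    16: "UnxCmplt", 17: "RxOF", 18: "MalfTLP", 19: "ECRC", 20: "UnsupReq",
}

def decode(value_hex, names):
    """Return the names of all bits set in a 32-bit hex status or mask value."""
    value = int(value_hex, 16)
    return [names.get(bit, f"bit{bit}") for bit in range(32) if (value >> bit) & 1]

# Values taken from the two log excerpts above.
print("correctable status:  ", decode("00000080", CORRECTABLE))
print("correctable mask:    ", decode("00002000", CORRECTABLE))
print("uncorrectable status:", decode("00004000", UNCORRECTABLE))
```

Read this way, the correctable status 00000080 is bit 7 (BadDLLP) and the
uncorrectable status 00004000 is bit 14 (CmpltTO), matching the names the
kernel printed.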

The box with the newer JHL7440 doesn't produce as many BadDLLP errors, but it
also crashes with CmpltTO.
Without passthrough, and with the Nvidia driver on the host system, there are
still lots of BadDLLP errors, but I haven't seen a crash.
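
To compare the boxes by numbers rather than by impression, the AER detail lines
can be tallied per device and error name. A rough Python sketch, assuming the
log has one message per line in the format shown above (the `tally` helper and
the `DETAIL` pattern are hypothetical names, and the sample contains only the
two detail lines quoted in this comment):

```python
import re
from collections import Counter

# Matches AER detail lines such as:
#   [ 6323.581973] pcieport 0000:64:01.0:    [ 7] BadDLLP
# Captures the PCI address and the error name.
DETAIL = re.compile(
    r"\]\s+\S+\s+([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+\[\s*\d+\]\s+(\w+)"
)

def tally(lines):
    """Count AER detail lines per (device, error name) pair."""
    counts = Counter()
    for line in lines:
        match = DETAIL.search(line)
        if match:
            counts[match.groups()] += 1
    return counts

sample = """\
[ 6323.581969] pcieport 0000:64:01.0:   device [8086:15da] error status/mask=00000080/00002000
[ 6323.581973] pcieport 0000:64:01.0:    [ 7] BadDLLP
[ 6360.466655] vfio-pci 0000:65:00.0:    [14] CmpltTO                (First)
"""

for (device, error), count in tally(sample.splitlines()).items():
    print(device, error, count)
```

On a real system the same function could be fed the lines of a saved kernel
log instead of the sample string.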

I tried pcie_aspm=off with those boxes, but in that case they are not
initialized, either on hotplug or on cold boot. An Intel-based system shows the
same behaviour.
pcie_aspm=force causes some additional errors on the PCIe bus.

A possible workaround for me to get a stable system with passthrough is to use
pci=nommconf, but this causes graphical glitches on the host GPU during 3D
rendering.
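
When testing these options it helps to confirm which of them the running kernel
was actually booted with. A small Python sketch, assuming a simple
space-separated command line (the `kernel_params` helper and the sample string
are hypothetical; quoted values containing spaces are not handled, and if a
parameter is repeated the last occurrence wins):

```python
def kernel_params(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (for example 'quiet') map to an empty string.
    """
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params

# Hypothetical command line for illustration only.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 quiet pci=nommconf"
params = kernel_params(sample)
for name in ("pcie_aspm", "pci"):
    print(name, "=", params.get(name, "(not set)"))
```

On Linux the real command line can be read from /proc/cmdline and passed to the
same function.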




