[Bug 215494] New: [radeon, rv370] Running piglit shaders@glsl-vs-raytrace-bug26691 test causes hard lockup & reboot

https://bugzilla.kernel.org/show_bug.cgi?id=215494

            Bug ID: 215494
           Summary: [radeon, rv370] Running piglit
                    shaders@glsl-vs-raytrace-bug26691 test causes hard
                    lockup & reboot
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.16.0
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@xxxxxxxxxxxxxxxxxxxx
          Reporter: erhard_f@xxxxxxxxxxx
                CC: alexdeucher@xxxxxxxxx
        Regression: No

Created attachment 300268
  --> https://bugzilla.kernel.org/attachment.cgi?id=300268&action=edit
kernel dmesg (kernel 5.16.0, Ryzen 9 5950X)

Running the piglit test suite (git-11ee10ba04) for
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3152 via './piglit run -1
quick -l verbose -s --dmesg' on a Radeon X600 causes the X600 to lock up hard
and the machine to reboot. On my system this happens with kernels 5.15.11 and
5.16.0, and with mesa 21.3.4 and mesa 22 (git-8b3d947267).

I had a closer look and found that shaders@glsl-vs-raytrace-bug26691 causes
the lockup. Running "./piglit/bin/glsl-vs-raytrace-bug26691 -auto -fbo" as a
single test sometimes works the first time, but re-running it a second or
third time always causes the lockup:

[...]
[  518.794824] radeon: wait for empty RBBM fifo failed! Bad things might happen.
[  519.110152] Failed to wait GUI idle while programming pipes. Bad things might happen.
[  519.111220] radeon 0000:07:00.0: Saved 59 dwords of commands on ring 0.
[  519.111247] radeon 0000:07:00.0: (r300_asic_reset:426) RBBM_STATUS=0x8411C100
[  519.616733] radeon 0000:07:00.0: (r300_asic_reset:445) RBBM_STATUS=0x8401C100
[  520.118160] radeon 0000:07:00.0: (r300_asic_reset:457) RBBM_STATUS=0x8400C100
[  520.118231] radeon 0000:07:00.0: failed to reset GPU
[  520.319694] pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:03.1
[  520.319723] pcieport 0000:00:03.1: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
[  520.319729] pcieport 0000:00:03.1:   device [1022:1483] error status/mask=00002000/00004000
[  520.319735] pcieport 0000:00:03.1:    [13] NonFatalErr
[  520.722345] pcieport 0000:00:03.1: AER: Corrected error received: 0000:00:03.1
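
The three RBBM_STATUS values printed by r300_asic_reset differ only in a
couple of bits, which is easier to see when the values are compared bit by
bit. The following is a minimal sketch of such a comparison, not part of the
driver or of piglit; the helper names (set_bits, parse_status,
cleared_between) are made up for illustration, and the log lines are copied
from the excerpt above without their timestamps.

```python
import re

# RBBM_STATUS lines copied from the dmesg excerpt (timestamps dropped).
LOG = """\
radeon 0000:07:00.0: (r300_asic_reset:426) RBBM_STATUS=0x8411C100
radeon 0000:07:00.0: (r300_asic_reset:445) RBBM_STATUS=0x8401C100
radeon 0000:07:00.0: (r300_asic_reset:457) RBBM_STATUS=0x8400C100
"""


def set_bits(value):
    """Return the indices of all bits set in a 32-bit value, lowest first."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def parse_status(log):
    """Extract (driver source line, RBBM_STATUS value) pairs from log text."""
    pattern = re.compile(
        r"\(r300_asic_reset:(\d+)\)\s+RBBM_STATUS=0x([0-9A-Fa-f]+)"
    )
    return [(int(src), int(val, 16)) for src, val in pattern.findall(log)]


def cleared_between(before, after):
    """Return the bit indices that were set in `before` but not in `after`."""
    return set_bits(before & ~after)


stages = parse_status(LOG)
for (_, prev), (src, cur) in zip(stages, stages[1:]):
    print(f"r300_asic_reset:{src}: cleared bits {cleared_between(prev, cur)}")
print(f"still set at the end: {set_bits(stages[-1][1])}")
```

With the values from this report, bit 20 clears between the first and second
readout and bit 16 between the second and third, while bits 8, 14, 15, 26 and
31 are still set when the driver prints "failed to reset GPU". What each bit
means is defined in the radeon driver's register headers and is not
reproduced here.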


For regular desktop usage the X600 seems to work fine so far. Some data about
the system:

 $ inxi -b
System:
  Host: prototype Kernel: 5.16.0-Zen3 x86_64 bits: 64 Desktop: Openbox 3.6.1 
  Distro: Gentoo Base System release 2.7 
Machine:
  Type: Desktop Mobo: ASRock model: B450M Steel Legend 
  serial: <superuser/root required> UEFI: American Megatrends v: P4.20 
  date: 08/03/2021 
CPU:
  Info: 16-Core AMD Ryzen 9 5950X [MT MCP] speed: 3685 MHz 
  min/max: 2200/3400 MHz 
Graphics:
  Device-1: AMD RV370 [Radeon X600/X600 SE] driver: radeon v: kernel 
  Display: x11 server: X.Org 1.20.14 driver: ati,radeon 
  unloaded: fbdev,modesetting resolution: 1920x1080~60Hz 
  OpenGL: renderer: ATI RV370 v: 2.1 Mesa 22.0.0-devel (git-8b3d947267) 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  driver: r8169 

 # lspci 
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse/Vermeer Data Fabric: Device 18h; Function 7
01:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01)
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge (rev 01)
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X600/X600 SE]
07:00.1 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] RV380 [Radeon X300/X550/X1050 Series] (Secondary)
08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
09:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

 # lspci -s 07:00.0 -vv
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X600/X600 SE] (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV370 [Radeon X600/X600 SE]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 59
        IOMMU group: 2
        Region 0: Memory at e8000000 (64-bit, prefetchable) [size=128M]
        Region 2: Memory at fce30000 (64-bit, non-prefetchable) [size=64K]
        Region 4: I/O ports at e000 [size=256]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <256ns, L1 <4us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset- SlotPowerLimit 75.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <2us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (ok), Width x16 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee01000  Data: 0022
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 04000001 0000200f 07070000 b8cdf5fd
        Kernel driver in use: radeon
        Kernel modules: radeon
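
The "error status/mask=00002000/00004000" line in the dmesg excerpt comes
from the root port 00:03.1, not from the X600 itself, whose own CESta flags
above are all clear. Both are bitmasks over the same correctable-error
register layout. The sketch below decodes such a status/mask pair into bit
positions and short names. The function names are made up for illustration,
and the bit-to-name table is an assumption based on the PCIe AER Correctable
Error Status register layout; only bit 13 (NonFatalErr) is confirmed by the
log itself, so the other entries should be checked against the specification.

```python
# Assumed bit layout of the AER Correctable Error Status/Mask registers.
# Only bit 13 is confirmed by the dmesg line "[13] NonFatalErr".
CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def decode(field):
    """Decode one hex field into a list of (bit index, short name) pairs."""
    value = int(field, 16)
    return [
        (bit, CORRECTABLE_BITS.get(bit, "unknown"))
        for bit in range(32)
        if (value >> bit) & 1
    ]


def decode_status_mask(text):
    """Split a 'status/mask' string as printed by the kernel and decode both."""
    status, mask = text.split("/")
    return {"status": decode(status), "mask": decode(mask)}


result = decode_status_mask("00002000/00004000")
print("status:", result["status"])
print("mask:  ", result["mask"])
```

For the pair in this report, the status field has only bit 13 set and the
mask field has only bit 14 set, so the reported error is not masked.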


