[Bug 59649] New: [r600][RV635] GPU lockup CP stall / GPU resets over and over - Kernel 3.7, 3.8-rcX

Priority: medium
Bug ID: 59649
Assignee: dri-devel@lists.freedesktop.org
Summary: [r600][RV635] GPU lockup CP stall / GPU resets over and over - Kernel 3.7, 3.8-rcX
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter: shawn.starr@rogers.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: 9.0
Component: Drivers/Gallium/r600
Product: Mesa

With Linux kernels 3.7 through 3.8-rc3, I am unable to keep a stable session
on my RV635 GPU. The kernel log shows repeated GPU lockups and soft resets:

Jan 19 03:45:26 segfault kernel: [15008.313696] radeon 0000:01:00.0: Saved 185
dwords of commands on ring 0.
Jan 19 03:45:26 segfault kernel: [15008.313704] radeon 0000:01:00.0: GPU
softreset
Jan 19 03:45:26 segfault kernel: [15008.313711] radeon 0000:01:00.0:  
R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.313717] radeon 0000:01:00.0:  
R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:45:26 segfault kernel: [15008.313723] radeon 0000:01:00.0:  
R_000E50_SRBM_STATUS=0x200000C0
Jan 19 03:45:26 segfault kernel: [15008.313730] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.313736] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.313742] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000006
Jan 19 03:45:26 segfault kernel: [15008.313748] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x80000645
Jan 19 03:45:26 segfault kernel: [15008.313761] radeon 0000:01:00.0:  
R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:45:26 segfault kernel: [15008.328772] radeon 0000:01:00.0:
R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 03:45:26 segfault kernel: [15008.344782] radeon 0000:01:00.0:  
R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:45:26 segfault kernel: [15008.344785] radeon 0000:01:00.0:  
R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:45:26 segfault kernel: [15008.344787] radeon 0000:01:00.0:  
R_000E50_SRBM_STATUS=0x200080C0
Jan 19 03:45:26 segfault kernel: [15008.344789] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344792] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344794] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000000
Jan 19 03:45:26 segfault kernel: [15008.344797] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x80100000
Jan 19 03:45:26 segfault kernel: [15008.345799] radeon 0000:01:00.0: GPU reset
succeeded, trying to resume
Jan 19 03:45:26 segfault kernel: [15008.348414] [drm] probing gen 2 caps for
device 8086:2a41 = 1/0
Jan 19 03:45:26 segfault kernel: [15008.350360] [drm] PCIE GART of 512M enabled
(table at 0x0000000000040000).
Jan 19 03:45:26 segfault kernel: [15008.350399] radeon 0000:01:00.0: WB enabled
Jan 19 03:45:26 segfault kernel: [15008.350403] radeon 0000:01:00.0: fence
driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr
0xffff880229236c00
Jan 19 03:45:26 segfault kernel: [15008.381778] [drm] ring test on 0 succeeded
in 1 usecs
Jan 19 03:45:26 segfault kernel: [15008.384549] [drm] ib test on ring 0
succeeded in 0 usecs
Jan 19 03:46:12 segfault kernel: [15053.625108] radeon 0000:01:00.0: GPU lockup
CP stall for more than 10000msec

...

Jan 19 03:46:12 segfault kernel: [15053.975428] radeon 0000:01:00.0: Wait for
MC idle timedout !
Jan 19 03:46:12 segfault kernel: [15054.123890] radeon 0000:01:00.0: Wait for
MC idle timedout !
Jan 19 03:46:12 segfault kernel: [15054.125748] [drm] PCIE GART of 512M enabled
(table at 0x0000000000040000).
Jan 19 03:46:12 segfault kernel: [15054.125785] radeon 0000:01:00.0: WB enabled
Jan 19 03:46:12 segfault kernel: [15054.125789] radeon 0000:01:00.0: fence
driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr
0xffff880229236c00
Jan 19 03:46:12 segfault kernel: [15054.157608] [drm] ring test on 0 succeeded
in 0 usecs
Jan 19 03:46:23 segfault kernel: [15064.657103] radeon 0000:01:00.0: GPU lockup
CP stall for more than 10000msec
Jan 19 03:46:23 segfault kernel: [15064.657114] radeon 0000:01:00.0: GPU lockup
(waiting for 0x00000000000441b6 last fence id 0x00000000000441a8)
Jan 19 03:46:23 segfault kernel: [15064.657121] [drm:r600_ib_test] *ERROR*
radeon: fence wait failed (-35).
Jan 19 03:46:23 segfault kernel: [15064.657134] [drm:radeon_ib_ring_tests]
*ERROR* radeon: failed testing IB on GFX ring (-35).
Jan 19 03:46:23 segfault kernel: [15064.657140] radeon 0000:01:00.0: ib ring
test failed (-35).
Jan 19 03:46:23 segfault kernel: [15064.658211] radeon 0000:01:00.0: GPU
softreset
Jan 19 03:46:23 segfault kernel: [15064.658218] radeon 0000:01:00.0:  
R_008010_GRBM_STATUS=0xE57C24E0
Jan 19 03:46:23 segfault kernel: [15064.658224] radeon 0000:01:00.0:  
R_008014_GRBM_STATUS2=0x00113303
Jan 19 03:46:23 segfault kernel: [15064.658230] radeon 0000:01:00.0:  
R_000E50_SRBM_STATUS=0x200030C0
Jan 19 03:46:23 segfault kernel: [15064.658236] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x01000000
Jan 19 03:46:23 segfault kernel: [15064.658242] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00001002
Jan 19 03:46:23 segfault kernel: [15064.658248] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00028482
Jan 19 03:46:23 segfault kernel: [15064.658254] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x80838645
Jan 19 03:46:23 segfault kernel: [15064.829116] radeon 0000:01:00.0: Wait for
MC idle timedout !
Jan 19 03:46:23 segfault kernel: [15064.829123] radeon 0000:01:00.0:  
R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 19 03:46:23 segfault kernel: [15064.844133] radeon 0000:01:00.0:
R_008020_GRBM_SOFT_RESET=0x00000001
Jan 19 03:46:23 segfault kernel: [15064.860144] radeon 0000:01:00.0:  
R_008010_GRBM_STATUS=0xA0003030
Jan 19 03:46:23 segfault kernel: [15064.860150] radeon 0000:01:00.0:  
R_008014_GRBM_STATUS2=0x00000003
Jan 19 03:46:23 segfault kernel: [15064.860163] radeon 0000:01:00.0:  
R_000E50_SRBM_STATUS=0x2000B0C0
Jan 19 03:46:23 segfault kernel: [15064.860169] radeon 0000:01:00.0:  
R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860175] radeon 0000:01:00.0:  
R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860181] radeon 0000:01:00.0:  
R_00867C_CP_BUSY_STAT     = 0x00000000
Jan 19 03:46:23 segfault kernel: [15064.860191] radeon 0000:01:00.0:  
R_008680_CP_STAT          = 0x80100000
Jan 19 03:46:23 segfault kernel: [15064.861197] radeon 0000:01:00.0: GPU reset
succeeded, trying to resume
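
The register dumps printed before and after each soft reset are easier to compare when extracted mechanically. Below is a minimal sketch, assuming the log text has been saved from dmesg or the journal; the helper names `parse_registers` and `diff_dumps` are hypothetical and not part of any radeon tooling. It only pulls out the `R_xxxxxx_NAME=0x...` pairs and XORs two dumps to show which bits changed; it does not decode what the individual bits mean.

```python
import re

# Matches entries such as "R_008010_GRBM_STATUS=0xE57C24E0" and
# "R_008680_CP_STAT          = 0x80838645" as they appear in the log above.
REG_RE = re.compile(r"R_[0-9A-F]{6}_([A-Z0-9_]+)\s*=\s*0x([0-9A-Fa-f]{8})")


def parse_registers(log_text):
    """Return (register name, value) pairs in the order they appear."""
    return [(m.group(1), int(m.group(2), 16)) for m in REG_RE.finditer(log_text)]


def diff_dumps(before, after):
    """Return {name: changed-bit mask} for registers whose value differs."""
    return {
        name: before[name] ^ after[name]
        for name in before
        if name in after and before[name] != after[name]
    }


# Two values copied from the 03:46:23 dump taken before the soft reset.
sample = """radeon 0000:01:00.0:
R_008010_GRBM_STATUS=0xE57C24E0
radeon 0000:01:00.0:
R_008680_CP_STAT          = 0x80838645
"""
pairs = parse_registers(sample)

# Values before and after the 03:46:23 soft reset, copied from the log.
before = {"GRBM_STATUS": 0xE57C24E0, "GRBM_STATUS2": 0x00113303}
after = {"GRBM_STATUS": 0xA0003030, "GRBM_STATUS2": 0x00000003}
changed = diff_dumps(before, after)
for name, mask in changed.items():
    print(f"{name}: changed bits 0x{mask:08X}")
```

Because the pattern matches on the register entry itself, it also works when the mail client has wrapped the register onto the line after the timestamp, as in the excerpts here.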

Jan 19 04:39:23 segfault kernel: [ 2791.671107] [drm:r600_ib_test] *ERROR*
radeon: fence wait failed (-35).

Jan 19 04:39:23 segfault kernel: [ 2791.671115] [drm:radeon_ib_ring_tests]
*ERROR* radeon: failed testing IB on GFX ring (-35).

The console is then flooded with:

[drm:radeon_cs_ib_chunk] *ERROR* Failed to schedule IB !
radeon 0000:01:00.0: couldn't schedule ib (over and over)

mesa-dri-drivers-9.0.1-3.fc18.x86_64
libdrm-2.4.40-1.fc18.x86_64

kernels: kernel-3.7.3-201.fc18.x86_64,
kernel-devel-3.8.0-0.rc3.git1.2.fc19.x86_64

I have not tried 3.8-rc4 yet.

Laptop:  Lenovo ThinkPad W500

