When "MC timeout" happens at GPU reset, we found that the 12th and 13th
bits of R_000E50_SRBM_STATUS are 1. In the kernel code these two bits are
defined as:

#define G_000E50_MCDX_BUSY(x)  (((x) >> 12) & 1)
#define G_000E50_MCDW_BUSY(x)  (((x) >> 13) & 1)

Could you please tell me what they mean? And if possible, I would like to
know the functionality of these 5 registers in detail:

#define R_000E60_SRBM_SOFT_RESET  0x0E60
#define R_000E50_SRBM_STATUS      0x0E50
#define R_008020_GRBM_SOFT_RESET  0x8020
#define R_008010_GRBM_STATUS      0x8010
#define R_008014_GRBM_STATUS2     0x8014

A bit more info: if I reset the MC after resetting the CP (this is what
Linux 2.6.34 does, but it was removed as of 2.6.35), then "MC timeout"
disappears, but there is still "ring test failed".

Huacai Chen

> 2011/11/8 <chenhc@xxxxxxxxxx>:
>> And, I want to know something:
>> 1, Does GPU use MC to access GTT?
>
> Yes. All GPU clients (display, 3D, etc.) go through the MC to access
> memory (vram or gart).
>
>> 2, What can cause MC timeout?
>
> Lots of things. Some GPU client still active, some GPU client hung or
> not properly initialized.
>
> Alex
>
>>
>>> Hi,
>>>
>>> Some status update.
>>> On 2011-09-29 at 5:17 PM, Chen Jie <chenj@xxxxxxxxxx> wrote:
>>>> Hi,
>>>> Add more information.
>>>> We occasionally got "GPU lockup" after resuming from suspend (on a mipsel
>>>> platform with a mips64 compatible CPU and rs780e; the kernel is
>>>> 3.1.0-rc8 64bit).
>>>> Related kernel message:
>>>> /* return from STR */
>>>> [  156.152343] radeon 0000:01:05.0: WB enabled
>>>> [  156.187500] [drm] ring test succeeded in 0 usecs
>>>> [  156.187500] [drm] ib test succeeded in 0 usecs
>>>> [  156.398437] ata2: SATA link down (SStatus 0 SControl 300)
>>>> [  156.398437] ata3: SATA link down (SStatus 0 SControl 300)
>>>> [  156.398437] ata4: SATA link down (SStatus 0 SControl 300)
>>>> [  156.578125] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>>>> [  156.597656] ata1.00: configured for UDMA/133
>>>> [  156.613281] usb 1-5: reset high speed USB device number 4 using ehci_hcd
>>>> [  157.027343] usb 3-2: reset low speed USB device number 2 using ohci_hcd
>>>> [  157.609375] usb 3-3: reset low speed USB device number 3 using ohci_hcd
>>>> [  157.683593] r8169 0000:02:00.0: eth0: link up
>>>> [  165.621093] PM: resume of devices complete after 9679.556 msecs
>>>> [  165.628906] Restarting tasks ... done.
>>>> [  177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than 10019msec
>>>> [  177.089843] ------------[ cut here ]------------
>>>> [  177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x25c/0x33c()
>>>> [  177.105468] GPU lockup (waiting for 0x000013C3 last fence id 0x000013AD)
>>>> [  177.113281] Modules linked in: psmouse serio_raw
>>>> [  177.117187] Call Trace:
>>>> [  177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
>>>> [  177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
>>>> [  177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
>>>> [  177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
>>>> [  177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
>>>> [  177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl+0x80/0x114
>>>> [  177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
>>>> [  177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
>>>> [  177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
>>>> [  177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
>>>> [  177.179687] ---[ end trace 92f63d998efe4c6d ]---
>>>> [  177.187500] radeon 0000:01:05.0: GPU softreset
>>>> [  177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
>>>> [  177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
>>>> [  177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
>>>> [  177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
>>>> [  177.367187] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
>>>> [  177.390625] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
>>>> [  177.414062] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
>>>> [  177.417968] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
>>>> [  177.425781] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x2002B040
>>>> [  177.433593] radeon 0000:01:05.0: GPU reset succeed
>>>> [  177.605468] radeon 0000:01:05.0: Wait for MC idle timedout !
>>>> [  177.761718] radeon 0000:01:05.0: Wait for MC idle timedout !
>>>> [  177.804687] radeon 0000:01:05.0: WB enabled
>>>> [  178.000000] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
>>> After pinning the ring in VRAM, it warned of an ib test failure. It
>>> seems something is wrong with access through GTT.
>>>
>>> We dumped the gart table just after stopping the cp and compared it
>>> with the one dumped just after r600_pcie_gart_enable, and didn't find
>>> any difference.
>>>
>>> Any idea?
>>>
>>>> [  178.007812] [drm:r600_resume] *ERROR* r600 startup failed on resume
>>>> [  178.988281] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).
>>>> [  178.996093] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
>>>> [  179.003906] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
>>>> ...
>>>
>>> Regards,
>>> -- Chen Jie
>>>
>>
>
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
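
For reference, here is a minimal sketch of how the SRBM_STATUS values in the quoted log decode with the two macros asked about at the top of this message. It assumes only those two macro definitions; the helpers print_mc_busy and mc_busy are hypothetical names for illustration and are not kernel functions. It says nothing about what the MCDX and MCDW blocks do, which is the open question in this thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit-field accessors as quoted from the kernel header in this thread. */
#define G_000E50_MCDX_BUSY(x)  (((x) >> 12) & 1)
#define G_000E50_MCDW_BUSY(x)  (((x) >> 13) & 1)

/* Hypothetical helper (not in the kernel): print both MC busy bits
 * for one raw SRBM_STATUS value. */
static void print_mc_busy(const char *when, uint32_t srbm_status)
{
    printf("%s: SRBM_STATUS=0x%08X MCDX_BUSY=%u MCDW_BUSY=%u\n",
           when, (unsigned)srbm_status,
           (unsigned)G_000E50_MCDX_BUSY(srbm_status),
           (unsigned)G_000E50_MCDW_BUSY(srbm_status));
}

/* Hypothetical helper: returns 1 if either MC busy bit is set, else 0. */
static int mc_busy(uint32_t srbm_status)
{
    return G_000E50_MCDX_BUSY(srbm_status) || G_000E50_MCDW_BUSY(srbm_status);
}
```

Calling print_mc_busy("before reset", 0x20023040) and print_mc_busy("after reset", 0x2002B040) with the two values from the log reports both bits as 1 in each case, which is consistent with the "Wait for MC idle timedout" messages on either side of the soft reset.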