On Tue, Nov 08, 2011 at 03:33:03PM +0800, Chen Jie wrote:
> Hi,
>
> Some status update.
> On September 29, 2011 at 5:17 PM, Chen Jie <chenj@xxxxxxxxxx> wrote:
> > Hi,
> >
> > Adding more information.
> > We occasionally get a "GPU lockup" after resuming from suspend (on a mipsel
> > platform with a mips64 compatible CPU and rs780e, the kernel is 3.1.0-rc8
> > 64bit). Related kernel messages:
> > /* return from STR */
> > [ 156.152343] radeon 0000:01:05.0: WB enabled
> > [ 156.187500] [drm] ring test succeeded in 0 usecs
> > [ 156.187500] [drm] ib test succeeded in 0 usecs
> > [ 156.398437] ata2: SATA link down (SStatus 0 SControl 300)
> > [ 156.398437] ata3: SATA link down (SStatus 0 SControl 300)
> > [ 156.398437] ata4: SATA link down (SStatus 0 SControl 300)
> > [ 156.578125] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > [ 156.597656] ata1.00: configured for UDMA/133
> > [ 156.613281] usb 1-5: reset high speed USB device number 4 using ehci_hcd
> > [ 157.027343] usb 3-2: reset low speed USB device number 2 using ohci_hcd
> > [ 157.609375] usb 3-3: reset low speed USB device number 3 using ohci_hcd
> > [ 157.683593] r8169 0000:02:00.0: eth0: link up
> > [ 165.621093] PM: resume of devices complete after 9679.556 msecs
> > [ 165.628906] Restarting tasks ... done.
> > [ 177.085937] radeon 0000:01:05.0: GPU lockup CP stall for more than
> > 10019msec
> > [ 177.089843] ------------[ cut here ]------------
> > [ 177.097656] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267
> > radeon_fence_wait+0x25c/0x33c()
> > [ 177.105468] GPU lockup (waiting for 0x000013C3 last fence id 0x000013AD)
> > [ 177.113281] Modules linked in: psmouse serio_raw
> > [ 177.117187] Call Trace:
> > [ 177.121093] [<ffffffff806f3e7c>] dump_stack+0x8/0x34
> > [ 177.125000] [<ffffffff8022e4f4>] warn_slowpath_common+0x78/0xa0
> > [ 177.132812] [<ffffffff8022e5b8>] warn_slowpath_fmt+0x38/0x44
> > [ 177.136718] [<ffffffff80522ed8>] radeon_fence_wait+0x25c/0x33c
> > [ 177.144531] [<ffffffff804e9e70>] ttm_bo_wait+0x108/0x220
> > [ 177.148437] [<ffffffff8053b478>] radeon_gem_wait_idle_ioctl+0x80/0x114
> > [ 177.156250] [<ffffffff804d2fe8>] drm_ioctl+0x2e4/0x3fc
> > [ 177.160156] [<ffffffff805a1820>] radeon_kms_compat_ioctl+0x28/0x38
> > [ 177.167968] [<ffffffff80311a04>] compat_sys_ioctl+0x120/0x35c
> > [ 177.171875] [<ffffffff80211d18>] handle_sys+0x118/0x138
> > [ 177.179687] ---[ end trace 92f63d998efe4c6d ]---
> > [ 177.187500] radeon 0000:01:05.0: GPU softreset
> > [ 177.191406] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xF57C2030
> > [ 177.195312] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00111103
> > [ 177.203125] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20023040
> > [ 177.363281] radeon 0000:01:05.0: Wait for MC idle timedout !
> > [ 177.367187] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
> > [ 177.390625] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
> > [ 177.414062] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
> > [ 177.417968] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
> > [ 177.425781] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x2002B040
> > [ 177.433593] radeon 0000:01:05.0: GPU reset succeed
> > [ 177.605468] radeon 0000:01:05.0: Wait for MC idle timedout !
> > [ 177.761718] radeon 0000:01:05.0: Wait for MC idle timedout !
> > [ 177.804687] radeon 0000:01:05.0: WB enabled
> > [ 178.000000] [drm:r600_ring_test] *ERROR* radeon: ring test failed
> > (scratch(0x8504)=0xCAFEDEAD)
> After we pinned the ring in VRAM, it warned of an ib test failure. It seems
> something is wrong with accesses through GTT.
>
> We dumped the gart table just after stopping the CP and compared it with
> the one dumped just after r600_pcie_gart_enable, and didn't find any
> difference.
>
> Any idea?
>
> > [ 178.007812] [drm:r600_resume] *ERROR* r600 startup failed on resume
> > [ 178.988281] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule
> > IB(5).
> > [ 178.996093] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
> > [ 179.003906] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule
> > IB(6).
> > ...
> >

Do you have any kind of IOMMU? Is the gart table programmed with the
proper physical address for each page? Is the GPU a PCI bus master? (IIRC
a PCI device needs to be a bus master to be able to initiate requests to
memory.) Beyond that, there could be a lot of other PCI things getting in
the way.

Cheers,
Jerome
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
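
For the bus-master question above, a quick user-space check after resume might look like the sketch below. It is only an illustration, not part of the radeon driver: it assumes the Linux sysfs file /sys/bus/pci/devices/<BDF>/config is readable on the affected machine, and the helper names (read_config, read_command, is_bus_master) are made up for this example. The only hardware facts it relies on are that the 16-bit PCI command register sits at offset 0x04 of config space and that bit 2 is Bus Master Enable.

```python
import struct

PCI_COMMAND = 0x04         # offset of the 16-bit command register
PCI_COMMAND_MEMORY = 0x2   # bit 1: memory space decoding enabled
PCI_COMMAND_MASTER = 0x4   # bit 2: bus master enable


def read_config(bdf: str, length: int = 64) -> bytes:
    """Read the start of a device's PCI config space from sysfs.

    `bdf` is the domain:bus:device.function string, e.g. "0000:01:05.0".
    (Hypothetical helper; assumes the standard Linux sysfs layout.)
    """
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(length)


def read_command(cfg: bytes) -> int:
    """Return the little-endian 16-bit command register from a config dump."""
    if len(cfg) < PCI_COMMAND + 2:
        raise ValueError("config space dump too short")
    (cmd,) = struct.unpack_from("<H", cfg, PCI_COMMAND)
    return cmd


def is_bus_master(cfg: bytes) -> bool:
    """True if the Bus Master Enable bit is set in the command register."""
    return bool(read_command(cfg) & PCI_COMMAND_MASTER)


if __name__ == "__main__":
    # Device address taken from the log above; adjust for other systems.
    cfg = read_config("0000:01:05.0")
    print("command register: 0x%04x" % read_command(cfg))
    print("bus master enabled:", is_bus_master(cfg))
```

Running it once before suspend and once after a failing resume would show whether the bit was lost across STR. It says nothing about the IOMMU or the gart table contents, which need separate checks.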