On Tue, Jan 24, 2012 at 8:34 AM, Torsten Kaiser <just.for.lkml@xxxxxxxxxxxxxx> wrote:
> On Mon, Jan 23, 2012 at 7:01 PM, Torsten Kaiser
> <just.for.lkml@xxxxxxxxxxxxxx> wrote:
>> On Mon, Jan 23, 2012 at 5:57 PM, Jerome Glisse <j.glisse@xxxxxxxxx> wrote:
>>> On Sat, Jan 21, 2012 at 08:03:37PM +0100, Torsten Kaiser wrote:
>>>> After updating to kernel 3.3-rc1 I have experienced a lockup of my GPU.
>>>> I left my KDE desktop running until the screensaver turned off the
>>>> monitors. But on key presses it would not turn back on. Ctrl+Alt+F1 to
>>>> switch to another virtual console also did not work.
>>>> Alt+SysRq magic still worked, so I was able to force the syslog to
>>>> disk and restart the system.
>>>>
>>>
>>> Can you test if the attached patch helps your case?
>>
>> Patch is installed, but I can't reproduce the hang on demand.
>> It did happen a second time yesterday while letting the screensaver
>> kick in, but only at around the third or fourth try. Just using "xset
>> dpms force standby/suspend/off" did not trigger it.
>
> I think the patch did what it was intended to do, but it did not really help.
> While the GPU reset did seem to work, X still got stuck and was not
> able to turn the monitors back on.
>
> From the log:
> The GPU lockup happened while the system was idle:
> Jan 23 23:53:54 thoregon kernel: [17121.080129] radeon 0000:07:00.0: GPU lockup CP stall for more than 10000msec
> Jan 23 23:53:54 thoregon kernel: [17121.080137] GPU lockup (waiting for 0x002080B7 last fence id 0x002080B6)
> Jan 23 23:53:54 thoregon kernel: [17121.096334] radeon 0000:07:00.0: GPU softreset
> Jan 23 23:53:54 thoregon kernel: [17121.096341] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0xA0003028
> Jan 23 23:53:54 thoregon kernel: [17121.096346] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
> Jan 23 23:53:54 thoregon kernel: [17121.096351] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
> Jan 23 23:53:54 thoregon kernel: [17121.096362] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
> Jan 23 23:53:54 thoregon kernel: [17121.111386] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
> Jan 23 23:53:54 thoregon kernel: [17121.127378] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0x00003028
> Jan 23 23:53:54 thoregon kernel: [17121.127384] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
> Jan 23 23:53:54 thoregon kernel: [17121.127390] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
> Jan 23 23:53:54 thoregon kernel: [17121.128393] radeon 0000:07:00.0: GPU reset succeed
> Jan 23 23:53:54 thoregon kernel: [17121.133330] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
> Jan 23 23:53:54 thoregon kernel: [17121.133364] radeon 0000:07:00.0: WB enabled
> Jan 23 23:53:54 thoregon kernel: [17121.133370] [drm] fence driver on ring 0 use gpu addr 0x20000c00 and cpu addr 0xffff8803286e5c00
> Jan 23 23:53:54 thoregon kernel: [17121.179627] [drm] ring test on 0 succeeded in 1 usecs
> Jan 23 23:53:54 thoregon kernel: [17121.179653] [drm] ib test on ring 0 succeeded in 1 usecs

I found the commit (in xf86-video-ati) that causes the lockups and
filed a bug at the xorg bugzilla about it:
https://bugs.freedesktop.org/show_bug.cgi?id=45329

But that still leaves the regression in 3.3-rc1 that even with
Jerome's patch the X server is no longer able to recover from the
lockup, as shown by the SysRq+W trace below.

> There were no messages about X getting stuck ("blocked for more than
> 120 seconds"), but after trying to access the system and failing
> SysRq+W reported this:
> Jan 24 08:08:20 thoregon kernel: [46786.741180] SysRq : Show Blocked State
> Jan 24 08:08:20 thoregon kernel: [46786.741190] task PC stack pid father
> Jan 24 08:08:20 thoregon kernel: [46786.741270] X D ffff880337d50a00 0 3047 3026 0x00400004
> Jan 24 08:08:20 thoregon kernel: [46786.741281] ffff880327eacac0 0000000000000086 ffff880327d52e00 0000000000010a00
> Jan 24 08:08:20 thoregon kernel: [46786.741292] ffff88031be9bfd8 0000000000010a00 ffff88031be9a000 ffff88031be9bfd8
> Jan 24 08:08:20 thoregon kernel: [46786.741301] 0000000000010a00 ffff880327eacac0 0000000000010a00 0000000000010a00
> Jan 24 08:08:20 thoregon kernel: [46786.741310] Call Trace:
> Jan 24 08:08:20 thoregon kernel: [46786.741326] [<ffffffff815ee9f7>] ? schedule_timeout+0x157/0x220
> Jan 24 08:08:20 thoregon kernel: [46786.741336] [<ffffffff8103fbd0>] ? run_timer_softirq+0x240/0x240
> Jan 24 08:08:20 thoregon kernel: [46786.741346] [<ffffffff8133ee39>] ? radeon_fence_wait+0x239/0x3b0
> Jan 24 08:08:20 thoregon kernel: [46786.741356] [<ffffffff8104f340>] ? wake_up_bit+0x40/0x40
> Jan 24 08:08:20 thoregon kernel: [46786.741364] [<ffffffff81352e07>] ? radeon_ib_get+0x257/0x2e0
> Jan 24 08:08:20 thoregon kernel: [46786.741372] [<ffffffff81354d7a>] ? radeon_cs_ioctl+0x27a/0x4d0
> Jan 24 08:08:20 thoregon kernel: [46786.741381] [<ffffffff812f42d4>] ? drm_ioctl+0x3e4/0x490
> Jan 24 08:08:20 thoregon kernel: [46786.741389] [<ffffffff81354b00>] ? radeon_cs_finish_pages+0xa0/0xa0
> Jan 24 08:08:20 thoregon kernel: [46786.741398] [<ffffffff81024769>] ? do_page_fault+0x199/0x420
> Jan 24 08:08:20 thoregon kernel: [46786.741406] [<ffffffff810af30c>] ? mmap_region+0x1dc/0x570
> Jan 24 08:08:20 thoregon kernel: [46786.741414] [<ffffffff810de446>] ? do_vfs_ioctl+0x96/0x4e0
> Jan 24 08:08:20 thoregon kernel: [46786.741422] [<ffffffff810de8d9>] ? sys_ioctl+0x49/0x90
> Jan 24 08:08:20 thoregon kernel: [46786.741430] [<ffffffff815f1922>] ? system_call_fastpath+0x16/0x1b
>
> I did search my logs for more GPU lockups after noting that this also
> happened with 3.2.
> The first lockup in my logs occurred on Nov 4 under 3.1. But until
> 3.3-rc1 X always was able to resume normal operations.
>
> My best guess for the cause of the GPU lockups seems to be the upgrade
> from xf86-video-ati-6.14.2 to 6.14.3, but 3.3-rc1 seems to have an
> independent bug that prevents X from recovering from a GPU lockup/reset.
>
>>> Of course it would be best if we did not lock up in the first place.
>>
>> Not sure if this is important: I also upgraded to mesa 8.0-rc1 before
>> the first hang, but after switching back to 3.2 but still using mesa
>> 8.0 I did not have any problems.
>> Except the KDE desktop effects there should not have been any OpenGL
>> programs running.
>> The screen saver itself is just turning the screens off via the KDE
>> power profile.
>>
>> I will report again when I succeed in triggering the GPU lockup again...
>>
>> Torsten
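
For anyone who wants to check their own logs for earlier lockups the same way (the message mentions searching the logs back to Nov 4), here is a rough sketch in Python. It assumes unwrapped syslog lines in the format quoted in this thread; the function names find_lockups and lockups_per_day are made up for illustration and the regular expression has only been matched against the "GPU lockup CP stall" message shown here, so other kernel versions may word it differently.

```python
import re
from collections import Counter

# Pattern for the radeon lockup message as quoted in this thread.
# Assumption: one log entry per line (not wrapped by a mail client).
LOCKUP_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>[\d:]{8})\s+\S+\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+radeon\s+(?P<dev>\S+):\s+GPU lockup CP stall"
)


def find_lockups(lines):
    """Return (month, day, time, uptime_seconds, pci_device) per lockup line."""
    hits = []
    for line in lines:
        m = LOCKUP_RE.match(line)
        if m:
            hits.append(
                (m["month"], int(m["day"]), m["time"], float(m["uptime"]), m["dev"])
            )
    return hits


def lockups_per_day(hits):
    """Count lockups per (month, day) to see when they first appeared."""
    return Counter((month, day) for month, day, *_ in hits)


if __name__ == "__main__":
    # Two lines taken from the log above; only the first is a lockup report.
    sample = [
        "Jan 23 23:53:54 thoregon kernel: [17121.080129] radeon 0000:07:00.0: "
        "GPU lockup CP stall for more than 10000msec",
        "Jan 23 23:53:54 thoregon kernel: [17121.096334] radeon 0000:07:00.0: "
        "GPU softreset",
    ]
    for hit in find_lockups(sample):
        print(hit)
```

To scan a real log, pass an open file object to find_lockups() instead of the sample list; correlating the per-day counts with package upgrade dates is then a manual step.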