On 16.03.2015 23:52, Carsten Emde wrote:
> Hi Michel,
>
>>> [..]
>>> The most striking problem of kernel 3.18.9-rt4 affects all systems that
>>> are equipped with Radeon graphics (irrespective of whether PCIe cards or
>>> APUs with on-chip graphics). They suffer from a hanging radeon driver.
>>> The block occurs when accelerated graphics load is created by x11perf or
>>> gltestperf. Sometimes only the graphics are frozen while ssh login still
>>> is possible, sometimes the entire box is no longer accessible at all. In
>>> any case, a reboot is needed to recover from this situation.
>>>
>>> Here is a selection of kernel messages:
>> [...]
>> The commits from
>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=f957063fee6392bb9365370db6db74dc0b2dce0a
>>
>> to
>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=cffefd9bb31cd35ab745d3b49005d10616d25bdc
>>
>> and
>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=b6610101718d4ab90d793c482625e98eb1262cad
>>
>> might help for this.
>
> Thanks a lot. I have applied these patches to a number of systems:
> # quilt applied | tail -7
> patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch
> patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch
>
>
> The graphic boards still crash and freeze the screen, but in contrast
> to the earlier situation the systems remain accessible, and the X
> Window server can be restarted after the offensive programs are
> removed. The crashes were reliably triggered by
> - gltestperf
> or
> - x11perf -repeat 3 -subs 25 -time 2 -rect10
> but the crashes also occur several times per day during normal work
> such as browsing the Internet or writing a text document. If you wish
> me to provide additional diagnostic information such as running test
> programs while the graphic boards are unresponsive, I certainly can do
> that.

Does it also happen with a kernel built from a current drm-fixes tree?

http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes

I might have missed other needed fixes.


> Rack #0/Slot #3 [AMD/ATI] RV730 XT [Radeon HD 4670]:
>
> [21001.244036] INFO: task kworker/u24:6:267 blocked for more than 120 seconds.
> [21001.257773] Not tainted 3.18.9-rt4 #27
> [21001.266284] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [21001.281911] kworker/u24:6 D ffff88081ed8b340 0 267 2 0x10000000
> [21001.281937] Workqueue: radeon-crtc radeon_flip_work_func [radeon]
> [21001.281940] ffff880805d2fbe8 0000000000000046 ffff88081ed0c700 0000000000000000
> [21001.281941] 0000000000009000 000000000000c920 ffff8808112fb420 ffff880035254e30
> [21001.281943] 000000000000c280 000001000000c280 0000000000000003 ffff880035254e30
> [21001.281945] Call Trace:
> [21001.281950] [<ffffffff81721ce4>] schedule+0x34/0xa0
> [21001.281953] [<ffffffff8172425c>] schedule_timeout+0x22c/0x2d0
> [21001.281962] [<ffffffffa0439a06>] ? radeon_fence_process+0x16/0x40 [radeon]
> [21001.281971] [<ffffffffa0439a74>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [21001.281979] [<ffffffffa0439da7>] radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon]
> [21001.281982] [<ffffffff81098be0>] ? __wake_up_sync+0x20/0x20
> [21001.281991] [<ffffffffa043a106>] radeon_fence_wait+0x86/0xc0 [radeon]
> [21001.282000] [<ffffffffa0447eec>] radeon_flip_work_func+0x15c/0x190 [radeon]
> [21001.282003] [<ffffffff810709c4>] process_one_work+0x154/0x450
> [21001.282004] [<ffffffff81070fbb>] worker_thread+0x6b/0x4d0
> [21001.282006] [<ffffffff81070f50>] ? rescuer_thread+0x290/0x290
> [21001.282007] [<ffffffff81070f50>] ? rescuer_thread+0x290/0x290
> [21001.282009] [<ffffffff81075fed>] kthread+0xcd/0xf0
> [21001.282010] [<ffffffff81075f20>] ? kthread_worker_fn+0x1d0/0x1d0
> [21001.282013] [<ffffffff81725aec>] ret_from_fork+0x7c/0xb0
> [21001.282014] [<ffffffff81075f20>] ? kthread_worker_fn+0x1d0/0x1d0
>
>
> Rack #0/Slot #7 [AMD/ATI] Cayman XT [Radeon HD 6970]
>
> [ 481.091132] INFO: task Xorg:3459 blocked for more than 120 seconds.
> [ 481.103594] Not tainted 3.18.9-rt4 #28
> [ 481.112101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 481.127746] Xorg D ffff88041e68ab40 0 3459 3452 0x10400004
> [ 481.141882] ffff880413da38e8 0000000000000002 ffff88041e60c460 ffff8800c3ea3380
> [ 481.141882] ffff880413da38d8 ffffffff8108603f 000000000000c5a8 000000000000c5c8
> [ 481.141883] ffffffff81c19460 ffff8800c3ea3380 000000000000000c ffff8800c3ea3380
> [ 481.186228] Call Trace:
> [ 481.191114] [<ffffffff8108603f>] ? queue_delayed_work_on+0xff/0x110
> [ 481.191118] [<ffffffff816b50f4>] schedule+0x34/0xa0
> [ 481.191119] [<ffffffff816b72f4>] schedule_timeout+0x204/0x270
> [ 481.191148] [<ffffffffa00cd826>] ? radeon_fence_process+0x16/0x40 [radeon]
> [ 481.191157] [<ffffffffa00cd894>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [ 481.191165] [<ffffffffa00cdb07>] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon]
> [ 481.191167] [<ffffffff810ac310>] ? prepare_to_wait_event+0x110/0x110
> [ 481.191175] [<ffffffffa00cdf67>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [ 481.191191] [<ffffffffa01432af>] radeon_sa_bo_new+0x2cf/0x4e0 [radeon]
> [ 481.191194] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [ 481.191207] [<ffffffffa019d3e7>] radeon_ib_get+0x37/0xf0 [radeon]
> [ 481.191218] [<ffffffffa00e997d>] radeon_cs_ioctl+0x22d/0x820 [radeon]
> [ 481.191219] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [ 481.191228] [<ffffffffa001bc04>] drm_ioctl+0x1a4/0x630 [drm]
> [ 481.191231] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [ 481.191234] [<ffffffff8106e8da>] ? unpin_current_cpu+0x1a/0x70
> [ 481.191237] [<ffffffff81097440>] ? migrate_enable+0xb0/0x1b0
> [ 481.191243] [<ffffffffa00b004b>] radeon_drm_ioctl+0x4b/0x80 [radeon]
> [ 481.191245] [<ffffffff811c7040>] do_vfs_ioctl+0x2e0/0x4d0
> [ 481.191247] [<ffffffff811d1aa2>] ? __fget+0x72/0xa0
> [ 481.191248] [<ffffffff811c72b1>] SyS_ioctl+0x81/0xa0
> [ 481.191250] [<ffffffff816b8cb2>] tracesys_phase2+0xd4/0xd9
>
>
> Rack #0/Slot #8 [AMD/ATI] Tahiti XT [Radeon HD 7970/8970 OEM / R9 280X]:
>
> [19579.220958] INFO: task Xorg.bin:16569 blocked for more than 120 seconds.
> [19579.228008] Not tainted 3.18.9-rt4 #25
> [19579.232491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [19579.240719] Xorg.bin D ffffffff81716c70 0 16569 16215 0x10400080
> [19579.248076] ffff8805f78bf818 0000000000000002 ffff8805f78bf7f8 0000000000000002
> [19579.248077] 000000000000dc08 ffff880626a0dc08 000000000000dbe8 000000000000dc08
> [19579.248078] ffffffff81c1b500 ffff880606c614a0 ffff880614f7c000 ffff880606c614a0
> [19579.271393] Call Trace:
> [19579.273964] [<ffffffff81713da4>] schedule+0x34/0xa0
> [19579.273965] [<ffffffff817162dc>] schedule_timeout+0x1fc/0x280
> [19579.273990] [<ffffffffa00c7aa6>] ? radeon_fence_process+0x16/0x40 [radeon]
> [19579.273999] [<ffffffffa00c7b14>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [19579.274008] [<ffffffffa00c7e47>] radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon]
> [19579.274011] [<ffffffff810cf310>] ? __wake_up_sync+0x20/0x20
> [19579.274020] [<ffffffffa00c8237>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [19579.274035] [<ffffffffa013e2cf>] radeon_sa_bo_new+0x2af/0x4b0 [radeon]
> [19579.274049] [<ffffffffa0196077>] radeon_ib_get+0x37/0xe0 [radeon]
> [19579.274062] [<ffffffffa0194bbc>] radeon_vm_update_page_directory+0x6c/0x290 [radeon]
> [19579.274078] [<ffffffffa0144916>] ? si_ib_parse+0x396/0x430 [radeon]
> [19579.274089] [<ffffffffa00e44ab>] radeon_cs_ioctl+0x35b/0x850 [radeon]
> [19579.274098] [<ffffffffa0005bc7>] drm_ioctl+0x197/0x670 [drm]
> [19579.274102] [<ffffffff81373337>] ? debug_smp_processor_id+0x17/0x20
> [19579.274103] [<ffffffff8108ec2a>] ? unpin_current_cpu+0x1a/0x80
> [19579.274105] [<ffffffff810b85c4>] ? migrate_enable+0x84/0x160
> [19579.274111] [<ffffffffa00aa04c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
> [19579.274114] [<ffffffff811f8ae8>] do_vfs_ioctl+0x2c8/0x4c0
> [19579.274116] [<ffffffff81203902>] ? __fget+0x72/0xb0
> [19579.274117] [<ffffffff811f8d61>] SyS_ioctl+0x81/0xa0
> [19579.274118] [<ffffffff817179de>] tracesys_phase2+0xd4/0xd9
>
>
> Rack #4/Slot #1 Chipset: "KAVERI" (ChipID = 0x130c):
>
> [21721.088164] INFO: task Xorg:7436 blocked for more than 120 seconds.
> [21721.100625] Not tainted 3.18.9-rt4 #26
> [21721.109150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [21721.124795] Xorg D ffffffff816b7f88 0 7436 7430 0x10400004
> [21721.138897] ffff880409f278e8 0000000000000002 ffff88041e90c460 000000000000c5c8
> [21721.138898] ffff88041e90c5c8 0000000000000006 000000000000c5a8 000000000000c5c8
> [21721.138899] ffff8804177299c0 ffff880409f299c0 000000000000000c ffff880409f299c0
> [21721.183222] Call Trace:
> [21721.188110] [<ffffffff816b50f4>] schedule+0x34/0xa0
> [21721.188112] [<ffffffff816b72f4>] schedule_timeout+0x204/0x270
> [21721.188143] [<ffffffffa00cd826>] ? radeon_fence_process+0x16/0x40 [radeon]
> [21721.188153] [<ffffffffa00cd894>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [21721.188163] [<ffffffffa00cdb07>] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon]
> [21721.188165] [<ffffffff810ac310>] ? prepare_to_wait_event+0x110/0x110
> [21721.188176] [<ffffffffa00cdf67>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [21721.188193] [<ffffffffa01432af>] radeon_sa_bo_new+0x2cf/0x4e0 [radeon]
> [21721.188196] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [21721.188210] [<ffffffffa019d3e7>] radeon_ib_get+0x37/0xf0 [radeon]
> [21721.188223] [<ffffffffa00e997d>] radeon_cs_ioctl+0x22d/0x820 [radeon]
> [21721.188233] [<ffffffffa001bc04>] drm_ioctl+0x1a4/0x630 [drm]
> [21721.188236] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [21721.188238] [<ffffffff8106e8da>] ? unpin_current_cpu+0x1a/0x70
> [21721.188240] [<ffffffff81097440>] ? migrate_enable+0xb0/0x1b0
> [21721.188248] [<ffffffffa00b004b>] radeon_drm_ioctl+0x4b/0x80 [radeon]
> [21721.188250] [<ffffffff811c7040>] do_vfs_ioctl+0x2e0/0x4d0
> [21721.188252] [<ffffffff811d1aa2>] ? __fget+0x72/0xa0
> [21721.188254] [<ffffffff811c72b1>] SyS_ioctl+0x81/0xa0
> [21721.188255] [<ffffffff816b8cb2>] tracesys_phase2+0xd4/0xd9
>
>
> Rack #c/Slot #5 Chipset: "ATI Radeon HD 5800 Series" (ChipID = 0x6898)
>
> [19711.965733] INFO: task kworker/u24:13:197 blocked for more than 120 seconds.
> [19711.965737] Not tainted 3.18.9-rt4 #26
> [19711.965749] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [19711.965751] kworker/u24:13 D ffff88032901a560 0 197 2 0x10000000
> [19711.965784] Workqueue: radeon-crtc radeon_flip_work_func [radeon]
> [19711.965788] ffff880328b3bc58 0000000000000002 000000000001d65e 0000000000000000
> [19711.965789] ffff880328b3bfd8 000000000008a5c0 ffff880328b3bc78 ffffffffa0482589
> [19711.965791] ffff88032fa81920 ffff880328b30000 ffff88032c63d5f0 ffff880328b30000
> [19711.965794] Call Trace:
> [19711.965813] [<ffffffffa0482589>] ? radeon_fence_activity+0x160/0x172 [radeon]
> [19711.965818] [<ffffffff814e0d38>] schedule+0x7e/0x90
> [19711.965820] [<ffffffff814e2143>] schedule_timeout+0x25/0xd3
> [19711.965835] [<ffffffffa0482ba3>] ? radeon_fence_any_seq_signaled+0x52/0x69 [radeon]
> [19711.965850] [<ffffffffa0482d8d>] radeon_fence_wait_seq_timeout.constprop.6+0x1d3/0x2be [radeon]
> [19711.965853] [<ffffffff81066166>] ? __wake_up_sync+0x12/0x12
> [19711.965869] [<ffffffffa04830e1>] radeon_fence_wait+0x92/0xaa [radeon]
> [19711.965886] [<ffffffffa048dae1>] radeon_flip_work_func+0x11e/0x14f [radeon]
> [19711.965889] [<ffffffff8104cac1>] process_one_work+0x16e/0x2ae
> [19711.965891] [<ffffffff8104d0fe>] worker_thread+0x1df/0x2ca
> [19711.965892] [<ffffffff8104cf1f>] ? cancel_delayed_work+0x91/0x91
> [19711.965894] [<ffffffff8104cf1f>] ? cancel_delayed_work+0x91/0x91
> [19711.965895] [<ffffffff81051324>] kthread+0xae/0xb6
> [19711.965897] [<ffffffff81051276>] ? __kthread_parkme+0x61/0x61
> [19711.965899] [<ffffffff814e322c>] ret_from_fork+0x7c/0xb0
> [19711.965901] [<ffffffff81051276>] ? __kthread_parkme+0x61/0x61
> [19711.965916] INFO: task compiz:2626 blocked for more than 120 seconds.
> [19711.965929] Not tainted 3.18.9-rt4 #26
> [19711.965931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [19711.965932] compiz D ffff88032901a560 0 2626 2186 0x30020000
> [19711.965937] ffff8800b8ee7bc8 0000000000200002 ffff88032bb9e480 0000000000000000
> [19711.965942] ffff8800b8ee7fd8 000000000008a5c0 0000000000000000 ffff8800b8ee7ee0
> [19711.965951] ffffffff81a25450 ffff88032bb9e480 ffff8800b8ee7c28 ffff88032bb9e480
> [19711.965954] Call Trace:
> [19711.965958] [<ffffffff814e0d38>] schedule+0x7e/0x90
> [19711.965959] [<ffffffff814e1ab7>] __rt_mutex_slowlock+0x9f/0xdc
> [19711.965961] [<ffffffff814e1f7b>] rt_mutex_slowlock+0x123/0x236
> [19711.965964] [<ffffffff8106b234>] rt_mutex_fastlock.constprop.24+0x2e/0x30
> [19711.965965] [<ffffffff814e2103>] rt_mutex_lock+0x13/0x15
> [19711.965967] [<ffffffff8106b613>] __rt_down_read.isra.1+0x29/0x30
> [19711.965968] [<ffffffff8106b628>] rt_down_read+0xe/0x10
> [19711.965988] [<ffffffffa04942ff>] radeon_gem_create_ioctl+0x2c/0xc6 [radeon]
> [19711.965990] [<ffffffff812004f9>] ? avc_has_perm_noaudit+0xf7/0x109
> [19711.966004] [<ffffffffa010bc26>] drm_ioctl+0x380/0x3f8 [drm]
> [19711.966025] [<ffffffffa04942d3>] ? radeon_gem_pwrite_ioctl+0x28/0x28 [radeon]
> [19711.966027] [<ffffffff81200ca6>] ? inode_has_perm+0x2f/0x34
> [19711.966029] [<ffffffff81200e58>] ? file_has_perm+0x5d/0x81
> [19711.966040] [<ffffffffa046e00e>] radeon_drm_ioctl+0xe/0x10 [radeon]
> [19711.966067] [<ffffffffa0518b9c>] radeon_kms_compat_ioctl+0x1b/0x1f [radeon]
> [19711.966070] [<ffffffff8115e692>] compat_SyS_ioctl+0x1c3/0xf6e
> [19711.966072] [<ffffffff8100e7b1>] ? syscall_trace_enter+0x52/0x57
> [19711.966074] [<ffffffff814e5679>] ia32_do_call+0x13/0x13

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel