On Fri, Mar 10, 2017 at 06:51:27PM +0100, Wargreen wrote:
> Do you think i should send an issue to DRM/Radeon ?

More eyes certainly wouldn't hurt.

> On 07/03/2017 19:41, Wargreen wrote:
> > Sorry, i'm more a user than a debugger...
> >
> > So, the /proc/[PID]/stack output, with a gnome_shell:
> > [<ffffffffc0b37e9f>] radeon_fence_default_wait+0xbf/0x170 [radeon]
> > [<ffffffffae34227d>] fence_wait_timeout+0x9d/0x350
> > [<ffffffffc0a2dba5>] ttm_bo_vm_fault+0x455/0x530 [ttm]
> > [<ffffffffc0b39e27>] radeon_ttm_fault+0x47/0x70 [radeon]
> > [<ffffffffae00f9d1>] __do_fault+0x81/0x170
> > [<ffffffffae01377b>] handle_mm_fault+0x57b/0x1450
> > [<ffffffffade70459>] __do_page_fault+0x289/0x5d0
> > [<ffffffffade707c2>] do_page_fault+0x22/0x30
> > [<ffffffffae4fccf8>] page_fault+0x28/0x30
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
> > And the corresponding locks :
> > Mar 7 19:15:49 LaChoze kernel: [ 239.624948] NOHZ:
> > local_softirq_pending 80
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718336] sysrq: SysRq : Show Locks
> > Held
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718342]
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718342] Showing all locks held in
> > the system:
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718371] 5 locks held by
> > irq/1-i8042/126:
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718372] #0:
> > (&serio->lock){+.+...}, at: [<ffffffffae34dab8>] serio_interrupt+0x28/0x80
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718378] #1:
> > (&dev->event_lock){+.+...}, at: [<ffffffffae353cba>] input_event+0x3a/0x60
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718382] #2:
> > (rcu_read_lock){......}, at: [<ffffffffae352ee5>]
> > input_pass_values.part.5+0x5/0x270
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718385] #3:
> > (rcu_read_lock){......}, at: [<ffffffffae2c1eb5>] __handle_sysrq+0x5/0x220
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718389] #4:
> > (tasklist_lock){+.+...}, at: [<ffffffffadeeae1d>]
> > debug_show_all_locks+0x3d/0x1a0
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718412] 1 lock held by in:imklog/895:
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718412] #0:
> > (&f->f_pos_lock){+.+.+.}, at: [<ffffffffae08883a>] __fdget_pos+0x4a/0x50
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718435] 2 locks held by Xorg/2215:
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718435] #0:
> > (&rdev->exclusive_lock){++++.+}, at: [<ffffffffadef3710>]
> > rt_down_read+0x10/0x20
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718439] #1:
> > (&rdev->pm.mclk_lock){++++++}, at: [<ffffffffadef3710>]
> > rt_down_read+0x10/0x20
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718452] 2 locks held by
> > gnome-shell/2412:
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718453] #0:
> > (&rdev->pm.mclk_lock){++++++}, at: [<ffffffffadef3710>]
> > rt_down_read+0x10/0x20
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718456] #1:
> > (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffc0a2d7b0>]
> > ttm_bo_vm_fault+0x60/0x530 [ttm]
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718487]
> > Mar 7 19:15:57 LaChoze kernel: [ 247.718487]
> > =============================================

Hmm, there is nothing obviously wrong here, or at least nothing that
sticks out to me.

I see that there are a few fence tracepoints that the kernel has made
available, perhaps you can run an execution in the working case with
those tracepoints enabled (and probably sched/ irq/ events as well), and
compare them to a test run where it fails.

Beyond that, you might have to get help from folks more familiar with
this particular driver :(.

   Julia
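
For reference, the tracepoint suggestion above could be carried out roughly
as follows. This is only a sketch under a few assumptions: tracefs is taken
to be mounted at /sys/kernel/debug/tracing (some systems use
/sys/kernel/tracing), the fence events are assumed to live in a group named
"fence" (kernels that renamed fence_wait_timeout() to dma_fence_wait_timeout()
call it "dma_fence"), and the helper names set_event_groups() and
save_trace() are made up for illustration.

```python
import os

# Assumption: tracefs is reachable here; adjust if it is mounted elsewhere.
TRACING_ROOT = "/sys/kernel/debug/tracing"

# Event groups named in the reply. "fence" is an assumption that matches
# kernels whose stacks show fence_wait_timeout().
EVENT_GROUPS = ("fence", "sched", "irq")


def set_event_groups(root, groups, enabled):
    """Write 1 or 0 to events/<group>/enable; return the groups not found."""
    missing = []
    for group in groups:
        path = os.path.join(root, "events", group, "enable")
        if not os.path.exists(path):
            # Group is absent on this kernel (or root is wrong): report it.
            missing.append(group)
            continue
        with open(path, "w") as f:
            f.write("1" if enabled else "0")
    return missing


def save_trace(root, dest):
    """Copy the current contents of the trace buffer to dest."""
    with open(os.path.join(root, "trace")) as src, open(dest, "w") as out:
        for line in src:
            out.write(line)
```

Run as root, one would call set_event_groups(TRACING_ROOT, EVENT_GROUPS, True),
reproduce the working case, disable the groups again, and call save_trace()
with a file name of one's choosing, then repeat for the failing case and
compare the two files. If the desktop is hung in the failing case, the
capture probably has to be driven from an ssh session or a text console.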