On 2012.12.18 at 16:24 +0100, Maarten Lankhorst wrote:
> On 18-12-12 14:38, Markus Trippelsdorf wrote:
> > On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
> >> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote:
> >>> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
> >>>> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
> >>>>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
> >>>>> <markus@xxxxxxxxxxxxxxx> wrote:
> >>>>>> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
> >>>>>>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
> >>>>>>> <markus@xxxxxxxxxxxxxxx> wrote:
> >>>>>>>> As soon as I open the following website:
> >>>>>>>> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
> >>>>>>>>
> >>>>>>>> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
> >>>>>>> Is this a regression? Most likely a 3D driver bug unless you are only
> >>>>>>> seeing it with specific kernels. What browser are you using and do
> >>>>>>> you have hw accelerated webgl, etc. enabled? If so, what version of
> >>>>>>> mesa are you using?
> >>>>>> This is a regression, because it is caused by yesterdays merge of
> >>>>>> drm-next by Linus. IOW I only see this bug when running a
> >>>>>> v3.7-9432-g9360b53 kernel.
> >>>>> Can you bisect? I'm guessing it may be related to the new DMA rings. Possibly:
> >>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
> >>>> Yes, the commit above causes the issue.
> >>>>
> >>>> 2d6cc72 GPU lockups
> >>> With 2d6cc72 reverted I get:
> >>>
> >>> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
> >> Probably a separate issue, can you bisect this one as well?
> > Yes. Git-bisect points to:
> >
> > 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
> > commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
> > Author: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>
> > Date: Thu Nov 29 11:36:54 2012 +0000
> >
> >     drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
> >     held, v3
> >
> > (Please note that this bug is a little bit harder to reproduce. But
> > when you scroll up and down for ~10 seconds on the webpage mentioned
> > above it will trigger the oops.
> > So while I'm not 100% sure that the issue is caused by exactly this
> > commit, the vicinity should be right)
> >
> Those dmesg warnings sound suspicious, looks like something is going
> very wrong there.
>
> Can you revert the one before it? "drm/radeon: allow move_notify to be
> called without reservation" Reservation should be held at this point,
> that commit got in accidentally.
>
> I doubt not holding a reservation is causing it though, I don't really
> see how that commit could cause it however, so can you please double
> check it never happened before that point, and only started at that
> commit?
>
> also slap in a BUG_ON(!ttm_bo_is_reserved(bo)) in
> ttm_bo_cleanup_refs_and_unlock for good measure, and a
> BUG_ON(spin_trylock(&bdev->fence_lock)); to ttm_bo_wait.
>
> I really don't see how that specific commit can be wrong though, so
> awaiting your results first before I try to dig more into it.

I just reran git-bisect on your commits only (from 1a1494def to
97a875cbd) and landed on the same commit as above:

commit 85b144f86 (drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v3)

So now I'm pretty sure it's specifically this commit that started the
issue.

With your suggested debugging BUG_ONs added I still get:

Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
Dec 18 17:01:15 x4 kernel: Call Trace:
Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
Dec 18 17:01:15 x4 kernel: ---[ end trace 485a2dd5755db51e ]---
Dec 18 17:01:15 x4 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
Dec 18 17:01:15 x4 kernel: IP: [<ffffffff81296488>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 17:01:15 x4 kernel: PGD 211d09067 PUD 211d52067 PMD 0
Dec 18 17:01:15 x4 kernel: Oops: 0002 [#1] SMP
Dec 18 17:01:15 x4 kernel: CPU 1
Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Tainted: G W 3.7.0-rc7-00520-g85b144f-dirty #174 System manufacturer System Product Name/M4A78T-E
Dec 18 17:01:15 x4 kernel: RIP: 0010:[<ffffffff81296488>] [<ffffffff81296488>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 17:01:15 x4 kernel: RSP: 0018:ffff880211ddfaa8 EFLAGS: 00010203
Dec 18 17:01:15 x4 kernel: RAX: 0000000000000000 RBX: ffff8801f94e1c48 RCX: ffff880205de3128
Dec 18 17:01:15 x4 kernel: RDX: 0000000000000001 RSI: ffff8801f94e1df0 RDI: ffff8801f94e1df8
Dec 18 17:01:15 x4 kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
Dec 18 17:01:15 x4 kernel: R10: 0000000000000000 R11: ffff880216a766b8 R12: ffff880216a76590
Dec 18 17:01:15 x4 kernel: R13: ffffffff818383e0 R14: 0000000000000001 R15: ffff880215c83678
Dec 18 17:01:15 x4 kernel: FS: 00007fbcabc8c880(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
Dec 18 17:01:15 x4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 18 17:01:15 x4 kernel: CR2: 0000000000000024 CR3: 0000000211d07000 CR4: 00000000000007e0
Dec 18 17:01:15 x4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 18 17:01:15 x4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 18 17:01:15 x4 kernel: Process X (pid: 157, threadinfo ffff880211dde000, task ffff880211dc0ba0)
Dec 18 17:01:15 x4 kernel: Stack:
Dec 18 17:01:15 x4 kernel: ffffffff8125d2e9 ffff8801f94e1c48 ffffffff8125e909 ffff880216a769b8
Dec 18 17:01:15 x4 kernel: 01ff880200000001 ffff8801f94e1c84 0000000000000001 ffff880216a766b8
Dec 18 17:01:15 x4 kernel: 0000000000000000 ffff880215c83678 ffff8801f94e1c48 ffffffff8125f17c
Dec 18 17:01:15 x4 kernel: Call Trace:
Dec 18 17:01:15 x4 kernel: [<ffffffff8125d2e9>] ? ttm_bo_cleanup_memtype_use+0x19/0x90
Dec 18 17:01:15 x4 kernel: [<ffffffff8125e909>] ? ttm_bo_cleanup_refs_and_unlock+0x139/0x2d0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
Dec 18 17:01:15 x4 kernel: [<ffffffff8111c310>] ? fsnotify_clear_marks_by_inode+0x20/0xd0
Dec 18 17:01:15 x4 kernel: [<ffffffff810fbc35>] ? __destroy_inode+0x15/0x60
Dec 18 17:01:15 x4 kernel: [<ffffffff810de220>] ? kmem_cache_free+0x10/0x90
Dec 18 17:01:15 x4 kernel: [<ffffffff810f8eaf>] ? dput+0x2f/0x300
Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
Dec 18 17:01:15 x4 kernel: [<ffffffff811005fb>] ? mntput_no_expire+0x7b/0x170
Dec 18 17:01:15 x4 kernel: [<ffffffff8107bb6b>] ? lg_global_unlock+0x3b/0x50
Dec 18 17:01:15 x4 kernel: [<ffffffff81071b9c>] ? task_work_run+0x8c/0xc0
Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b
Dec 18 17:01:15 x4 kernel: Code: 8b 44 24 04 48 83 c4 08 5b 5d 41 5c c3 66 0f 1f 44 00 00 48 8b 86 f0 01 00 00 48 81 c6 f0 01 00 00 48 39 f0 74 11 0f 1f 44 00 00 <c6> 40 24 00 48 8b 00 48 39 f0 75 f4 f3 c3 66 2e 0f 1f 84 00 00
Dec 18 17:01:15 x4 kernel: RIP [<ffffffff81296488>] radeon_vm_bo_invalidate+0x18/0x30
Dec 18 17:01:15 x4 kernel: RSP <ffff880211ddfaa8>
Dec 18 17:01:15 x4 kernel: CR2: 0000000000000024
Dec 18 17:01:15 x4 kernel: ---[ end trace 485a2dd5755db51f ]---
Dec 18 17:01:15 x4 kernel: [drm:drm_release] *ERROR* Device busy: 1

--
Markus