On 18-12-12 17:12, Markus Trippelsdorf wrote:
> On 2012.12.18 at 16:24 +0100, Maarten Lankhorst wrote:
>> On 18-12-12 14:38, Markus Trippelsdorf wrote:
>>> On 2012.12.18 at 12:20 +0100, Michel Dänzer wrote:
>>>> On Mon, 2012-12-17 at 23:55 +0100, Markus Trippelsdorf wrote:
>>>>> On 2012.12.17 at 23:25 +0100, Markus Trippelsdorf wrote:
>>>>>> On 2012.12.17 at 17:00 -0500, Alex Deucher wrote:
>>>>>>> On Mon, Dec 17, 2012 at 4:48 PM, Markus Trippelsdorf
>>>>>>> <markus@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> On 2012.12.17 at 16:32 -0500, Alex Deucher wrote:
>>>>>>>>> On Mon, Dec 17, 2012 at 1:27 PM, Markus Trippelsdorf
>>>>>>>>> <markus@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>> As soon as I open the following website:
>>>>>>>>>> http://www.boston.com/bigpicture/2012/12/2012_year_in_pictures_part_i.html
>>>>>>>>>>
>>>>>>>>>> my Radeon RS780 stalls (GPU lockup) leaving the machine unusable:
>>>>>>>>> Is this a regression? Most likely a 3D driver bug unless you are only
>>>>>>>>> seeing it with specific kernels. What browser are you using and do
>>>>>>>>> you have hw accelerated webgl, etc. enabled? If so, what version of
>>>>>>>>> mesa are you using?
>>>>>>>> This is a regression, because it is caused by yesterdays merge of
>>>>>>>> drm-next by Linus. IOW I only see this bug when running a
>>>>>>>> v3.7-9432-g9360b53 kernel.
>>>>>>> Can you bisect? I'm guessing it may be related to the new DMA rings. Possibly:
>>>>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=2d6cc7296d4ee128ab0fa3b715f0afde511f49c2
>>>>>> Yes, the commit above causes the issue.
>>>>>>
>>>>>> 2d6cc72 GPU lockups
>>>>> With 2d6cc72 reverted I get:
>>>>>
>>>>> Dec 17 23:09:35 x4 kernel: ------------[ cut here ]------------
>>>> Probably a separate issue, can you bisect this one as well?
>>> Yes.
>>> Git-bisect points to:
>>>
>>> 85b144f860176ec18db927d6d9ecdfb24d9c6483 is the first bad commit
>>> commit 85b144f860176ec18db927d6d9ecdfb24d9c6483
>>> Author: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>
>>> Date: Thu Nov 29 11:36:54 2012 +0000
>>>
>>> drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock
>>> held, v3
>>>
>>> (Please note that this bug is a little bit harder to reproduce. But
>>> when you scroll up and down for ~10 seconds on the webpage mentioned
>>> above it will trigger the oops.
>>> So while I'm not 100% sure that the issue is caused by exactly this
>>> commit, the vicinity should be right)
>>>
>> Those dmesg warnings sound suspicious, looks like something is going
>> very wrong there.
>>
>> Can you revert the one before it? "drm/radeon: allow move_notify to be
>> called without reservation" Reservation should be held at this point,
>> that commit got in accidentally.
>>
>> I doubt not holding a reservation is causing it though, I don't really
>> see how that commit could cause it however, so can you please double
>> check it never happened before that point, and only started at that
>> commit?
>>
>> also slap in a BUG_ON(!ttm_bo_is_reserved(bo)) in
>> ttm_bo_cleanup_refs_and_unlock for good measure, and a
>> BUG_ON(spin_trylock(&bdev->fence_lock)); to ttm_bo_wait.
>>
>> I really don't see how that specific commit can be wrong though, so
>> awaiting your results first before I try to dig more into it.
> I just reran git-bisect just on your commits (from 1a1494def to 97a875cbd)
> and I landed on the same commit as above:
>
> commit 85b144f86 (drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v3)
>
> So now I'm pretty sure it's specifically this commit that started the
> issue.
>
> With your supposed debugging BUG_ONs added I still get:
>
> Dec 18 17:01:15 x4 kernel: ------------[ cut here ]------------
> Dec 18 17:01:15 x4 kernel: WARNING: at include/linux/kref.h:42 radeon_fence_ref+0x2c/0x40()
> Dec 18 17:01:15 x4 kernel: Hardware name: System Product Name
> Dec 18 17:01:15 x4 kernel: Pid: 157, comm: X Not tainted 3.7.0-rc7-00520-g85b144f-dirty #174
> Dec 18 17:01:15 x4 kernel: Call Trace:
> Dec 18 17:01:15 x4 kernel: [<ffffffff81058c84>] ? warn_slowpath_common+0x74/0xb0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8129273c>] ? radeon_fence_ref+0x2c/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125e95c>] ? ttm_bo_cleanup_refs_and_unlock+0x18c/0x2d0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f17c>] ? ttm_mem_evict_first+0x1dc/0x2a0
> Dec 18 17:01:15 x4 kernel: [<ffffffff81264452>] ? ttm_bo_man_get_node+0x62/0xb0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125f4ce>] ? ttm_bo_mem_space+0x28e/0x340
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fb0c>] ? ttm_bo_move_buffer+0xfc/0x170
> Dec 18 17:01:15 x4 kernel: [<ffffffff810de172>] ? kmem_cache_alloc+0xb2/0xc0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125fc15>] ? ttm_bo_validate+0x95/0x110
> Dec 18 17:01:15 x4 kernel: [<ffffffff8125ff7c>] ? ttm_bo_init+0x2ec/0x3b0
> Dec 18 17:01:15 x4 kernel: [<ffffffff8129419a>] ? radeon_bo_create+0x18a/0x200
> Dec 18 17:01:15 x4 kernel: [<ffffffff81293e80>] ? radeon_bo_clear_va+0x40/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a5342>] ? radeon_gem_object_create+0x92/0x160
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a575c>] ? radeon_gem_create_ioctl+0x6c/0x150
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a529f>] ? radeon_gem_object_free+0x2f/0x40
> Dec 18 17:01:15 x4 kernel: [<ffffffff81246b60>] ? drm_ioctl+0x420/0x4f0
> Dec 18 17:01:15 x4 kernel: [<ffffffff812a56f0>] ? radeon_gem_pwrite_ioctl+0x20/0x20
> Dec 18 17:01:15 x4 kernel: [<ffffffff810f53a4>] ? do_vfs_ioctl+0x2e4/0x4e0
> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5588>] ? vfs_read+0x118/0x160
> Dec 18 17:01:15 x4 kernel: [<ffffffff810f55ec>] ? sys_ioctl+0x4c/0xa0
> Dec 18 17:01:15 x4 kernel: [<ffffffff810e5851>] ? sys_read+0x51/0xa0
> Dec 18 17:01:15 x4 kernel: [<ffffffff814b0612>] ? system_call_fastpath+0x16/0x1b

So nothing changed.. did you revert the drm/radeon patch before it yet?

And wtf is going on here? That patch shouldn't cause such issues by
itself, and I don't see how the refcount on bo->sync_obj can be zero,
with bo->sync_obj non-null.

Refcounting seems to be messed up on the fence somewhere, but I don't
think it's caused by this patch..

~Maarten
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
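
For readers following the thread, here is a minimal userspace sketch of the situation the reply describes: a pointer that is still non-NULL while the refcount it guards has already dropped to zero. It assumes the warning at include/linux/kref.h:42 is the zero-refcount check in kref_get(). Every name below (kref_model, fence_model, bo_model, stale_sync_obj_demo) is hypothetical and only stands in for the kernel structures; none of this is the actual TTM or radeon code, and the sequence shown is one possible way to reach that state, not a diagnosis of this bug.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Userspace stand-in for struct kref; not the kernel definition. */
struct kref_model {
	atomic_int refcount;
};

/* Counts how many times the zero-refcount check fired. */
static int warnings;

static void kref_model_init(struct kref_model *k)
{
	atomic_init(&k->refcount, 1);
}

/* Assumed to mirror the check behind the WARNING in the trace:
 * taking a reference on an object whose count is already zero. */
static void kref_model_get(struct kref_model *k)
{
	if (atomic_load(&k->refcount) == 0) {
		warnings++;
		fprintf(stderr, "WARNING: get on zero refcount\n");
	}
	atomic_fetch_add(&k->refcount, 1);
}

/* Returns 1 when this put dropped the last reference. */
static int kref_model_put(struct kref_model *k)
{
	return atomic_fetch_sub(&k->refcount, 1) == 1;
}

/* Hypothetical stand-ins for the fence and the buffer object. */
struct fence_model {
	struct kref_model kref;
};

struct bo_model {
	struct fence_model *sync_obj;
};

/* Drops the last fence reference without clearing bo.sync_obj,
 * then takes a new reference through the stale pointer.
 * Returns the number of warnings raised. */
static int stale_sync_obj_demo(void)
{
	struct fence_model fence;
	struct bo_model bo = { .sync_obj = &fence };
	int last;

	warnings = 0;
	kref_model_init(&fence.kref);

	/* Refcount goes 1 -> 0; real code would free the fence here. */
	last = kref_model_put(&fence.kref);
	assert(last);

	/* The pointer is still non-NULL, so the get goes ahead. */
	if (bo.sync_obj)
		kref_model_get(&bo.sync_obj->kref);

	return warnings;
}
```

Calling stale_sync_obj_demo() from a small main() exercises the check once. In the kernel the same pattern would be a use-after-free in waiting, which is why a put that is not paired with clearing the pointer under the right lock is a natural suspect.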