Re: [Intel-gfx] [PATCH 1/3] drm/i915: Hold an object reference whilst we shrink it

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 04, 2013 at 10:45:50AM +0100, Chris Wilson wrote:
> Whilst running the shrinker, we need to hold a reference as we unbind
> the objects, or else we may end up waiting for and retiring requests,
> which in turn may result in this object being freed.
> 
> This is very similar to the eviction code which also has to be very
> careful to keep a reference to its objects as it retires and unbinds
> them.
> 
> Another similarity, that Ben pointed out, is that as we may call
> retire-requests, the unbound_list is outside of our control. We must
> only process a single element of that list at a time, that is we can not
> rely on the "safe" next pointer being valid after a call to
> i915_vma_unbind().
> 
>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>   IP: [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915]
>   PGD 758d3067 PUD ac0d6067 PMD 0
>   Oops: 0000 [#1] SMP
>   Modules linked in: dm_mod snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support pcspkr snd_hda_intel i2c_i801 snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd lpc_ich mfd_core soundcore battery ac option usb_wwan usbserial uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev i915 video button drm_kms_helper drm acpi_cpufreq mperf freq_table
>   CPU: 1 PID: 16835 Comm: fbo-maxsize Not tainted 3.11.0-rc7_nightlytop_8fdad4_20130902_+ #7977
>   task: ffff8800712106d0 ti: ffff880028e4a000 task.ti: ffff880028e4a000
>   RIP: 0010:[<ffffffffa0082892>]  [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915]
>   RSP: 0018:ffff880028e4b9e8  EFLAGS: 00010246
>   RAX: 0000000000000000 RBX: ffff880145734000 RCX: ffff880145735328
>   RDX: ffff8801457353fc RSI: 0000000000000000 RDI: ffff88007597cc00
>   RBP: ffff88007597cc00 R08: 0000000000000001 R09: ffff88014f257f00
>   R10: ffffea0001d65f00 R11: 0000000000bba60b R12: ffff880149e5b000
>   R13: ffff880145734001 R14: ffff88007597ccc8 R15: ffff88007597cc00
>   FS:  00007ff5bc919740(0000) GS:ffff88014f240000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000008 CR3: 0000000028f4c000 CR4: 00000000001407e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Stack:
>    0000000000000000 ffff88007597cc00 ffff8801440d6840 0000000000000000
>    ffff880145734000 ffffffffa007c854 0000000000000010 ffff88007597c900
>    0000000000018000 00000000004a1201 ffff88007597cc60 ffffffffa007d183
>   Call Trace:
>    [<ffffffffa007c854>] ? i915_vma_unbind+0xe2/0x1d1 [i915]
>    [<ffffffffa007d183>] ? __i915_gem_shrink+0xf1/0x162 [i915]
>    [<ffffffffa007d2ee>] ? i915_gem_object_get_pages_gtt+0xfa/0x303 [i915]
>    [<ffffffffa00795f4>] ? i915_gem_object_get_pages+0x54/0x89 [i915]
>    [<ffffffffa007cbda>] ? i915_gem_object_pin+0x238/0x5ce [i915]
>    [<ffffffff812cba5f>] ? __sg_page_iter_next+0x2b/0x58
>    [<ffffffffa0082056>] ? gen6_ppgtt_insert_entries+0xf2/0x114 [i915]
>    [<ffffffffa007fe4b>] ? i915_gem_execbuffer_reserve_vma.isra.13+0x79/0x18d [i915]
>    [<ffffffffa008017c>] ? i915_gem_execbuffer_reserve+0x21d/0x347 [i915]
>    [<ffffffffa0080bfb>] ? i915_gem_do_execbuffer.isra.17+0x4f3/0xe61 [i915]
>    [<ffffffffa00795f4>] ? i915_gem_object_get_pages+0x54/0x89 [i915]
>    [<ffffffffa007e405>] ? i915_gem_pwrite_ioctl+0x743/0x7a5 [i915]
>    [<ffffffffa0081a46>] ? i915_gem_execbuffer2+0x15e/0x1e4 [i915]
>    [<ffffffffa000e20d>] ? drm_ioctl+0x2a5/0x3c4 [drm]
>    [<ffffffffa00818e8>] ? i915_gem_execbuffer+0x37f/0x37f [i915]
>    [<ffffffff816f64c0>] ? __do_page_fault+0x3ab/0x449
>    [<ffffffff810be3da>] ? do_mmap_pgoff+0x2b2/0x341
>    [<ffffffff810e49be>] ? vfs_ioctl+0x1e/0x31
>    [<ffffffff810e5194>] ? do_vfs_ioctl+0x3ad/0x3ef
>    [<ffffffff810e5224>] ? SyS_ioctl+0x4e/0x7e
>    [<ffffffff816f88d2>] ? system_call_fastpath+0x16/0x1b
>   Code: 52 0c a0 48 c7 c6 22 30 0d a0 31 c0 e8 ef 00 f9 ff bf c6 a7 00 00 e8 90 5d 24 e1 f6 85 13 01 00 00 10 75 44 48 8b 85 18 01 00 00 <8b> 50 08 48 8b 30 49 8b 84 24 88 02 00 00 48 89 c7 48 81 c7 98
>   RIP  [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915]
>   RSP <ffff880028e4b9e8>
>   CR2: 0000000000000008
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fdeecae..c31e0b2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1695,6 +1695,7 @@ static long
>  __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		  bool purgeable_only)
>  {
> +	struct list_head still_bound_list;
>  	struct drm_i915_gem_object *obj, *next;
>  	long count = 0;
>  
> @@ -1709,23 +1710,41 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  		}
>  	}
>  
> -	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
> -				 global_list) {
> +	/* As we may completely rewrite the bound list whilst unbinding
> +	 * (due to retiring requests) we have to strictly process only
> +	 * one element of the list at the time, and recheck the list
> +	 * on every iteration.
> +	 */
> +	INIT_LIST_HEAD(&still_bound_list);
> +	while (count < target && !list_empty(&dev_priv->mm.bound_list)) {
>  		struct i915_vma *vma, *v;
>  
> +		obj = list_first_entry(&dev_priv->mm.bound_list,
> +				       typeof(*obj), global_list);
> +		list_move_tail(&obj->global_list, &still_bound_list);
> +
>  		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
>  			continue;
>  
> +		/* Hold a reference whilst we unbind this object, as we may
> +		 * end up waiting for and retiring requests, which may result
> +		 * in this object being freed.
> +		 *
> +		 * Note that the shrinker and eviction is special as they operate
> +		 * on the inactive lists which reference limbo objects.
> +		 */

This confused me a bit (imo makes only sense with an s/inactive/active/).
And I think this needs a some clarification to make it clear why we don't need
this same trickery for the unbound list. What about this instead?

		/*
		 * Hold a reference whilst we unbind this object, as we may
		 * end up waiting for and retiring requests. This might
		 * release the final reference (held by the active list)
		 * and result in the object being freed from under us.
		 * in this object being freed.
		 * 
		 * Note that only shrinking the bound list is special
		 * since only active (and hence bound objects) can contain
		 * such limbo objects.
		 */


> +		drm_gem_object_reference(&obj->base);

I also think we need a comment here explaining why it is safe to grab a
reference despite that the bound (or unbound list fwiw) doesn't hold a
reference itself. Ok if I add

		/*
		 * Note: Even though the bound list doesn't hold a
		 * reference to the object we can savely grab one here:
		 * The final object unreferencing and the bound_list are
		 * both protected by the dev->struct_mutex and so we won't
		 * ever be able to observe an object on the bound_list
		 * with a reference count equals 0.
		 */

when applying?

Cheers, Daniel
> +
>  		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
>  			if (i915_vma_unbind(vma))
>  				break;
>  
> -		if (!i915_gem_object_put_pages(obj)) {
> +		if (i915_gem_object_put_pages(obj) == 0)
>  			count += obj->base.size >> PAGE_SHIFT;
> -			if (count >= target)
> -				return count;
> -		}
> +
> +		drm_gem_object_unreference(&obj->base);
>  	}
> +	list_splice(&still_bound_list, &dev_priv->mm.bound_list);
>  
>  	return count;
>  }
> -- 
> 1.8.4.rc3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]