Re: [PATCH 2/3] drm/i915: Infrastructure for supporting different GGTT views per object

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Thu, 4 Dec 2014 12:27:53 +0000



On Thu, Dec 04, 2014 at 12:17:51PM +0000, Tvrtko Ursulin wrote:
> On 12/04/2014 10:59 AM, Daniel Vetter wrote:
> >On Thu, Dec 04, 2014 at 10:26:14AM +0000, Chris Wilson wrote:
> >>On Thu, Dec 04, 2014 at 10:19:09AM +0000, Tvrtko Ursulin wrote:
> >>>
> >>>On 12/04/2014 09:53 AM, Chris Wilson wrote:
> >>>>On Wed, Dec 03, 2014 at 02:59:25PM +0000, Tvrtko Ursulin wrote:
> >>>>>+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> >>>>>+		   u32 flags)
> >>>>>+{
> >>>>>+	struct sg_table *pages = i915_ggtt_view_pages(vma);
> >>>>>+
> >>>>>+	if (pages && !IS_ERR(pages)) {
> >>>>>+		vma->bind_vma(vma, pages, cache_level, flags);
> >>>>>+
> >>>>>+		if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
> >>>>>+			sg_free_table(pages);
> >>>>>+			kfree(pages);
> >>>>>+		}
> >>>>>+	}
> >>>>>+}
> >>>>
> >>>>Stop. Even if the failure path is impossible with the present
> >>>>implementation, here you are masking the error only to go and pretend
> >>>>the binding succeeded.
> >>>>
> >>>>Don't be lazy, this is a very nasty bug that should be hit during igt -
> >>>>or else you are not testing well enough.
> >>>
> >>>Fair comment, even if a bit too assuming. I actually had this as
> >>>TODO but somehow lost it.
> >>>
> >>>I don't have any ideas on how to provoke this to fail from an IGT?
> >>>Even with future implementations it boils down to a couple of small
> >>>allocations which would have to fail reliably.
> >>
> >>We have quite a few thrash tests now that are fairly good at getting
> >>even the small allocations to fail.
> >>
> >>What we don't have is a single-fd, multi-ctx thrash test (well except
> >>for some GL tests...)
> >
> >But none of these tests result in permanent memory failures (only the
> >occasional ioctl restart when waiting for gpu rendering). And sg table
> >alloc only recurses through the shrinker so that can't happen. So I think
> >we just have to get by with review.
> >
> >We did have issues with sg table allocations in stress tests though,
> >before we've added the recursive shrinker locking, hence sg table alloc
> >can indeed go south.
> 
> I looked at propagating errors from i915_vma_bind() out to callers
> and it is mostly all fine apart from the
> i915_gem_restore_gtt_mappings during i915_drm_resume.
> 
> I don't see how this is fixable apart by going back and having sgls
> stay around for the lifetime of their VMAs. It shouldn't be such a
> big deal - they are not so big even with non-coalesced entries.

The good news is that for a rebind, you already have the sg lists
already allocated and converted to dma addresses, and so you can just do
something like:

static inline void
i915_vma_rebind(struct i915_vma *vma, enum i915_cache_level cache_level)
{
        unsigned rebind;

        rebind = vma->bound;
        vma->bound = 0;

        WARN_ON(vma->bind_vma(vma, cache_level, rebind));
}
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx