On Thu, Dec 04, 2014 at 12:17:51PM +0000, Tvrtko Ursulin wrote: > On 12/04/2014 10:59 AM, Daniel Vetter wrote: > >On Thu, Dec 04, 2014 at 10:26:14AM +0000, Chris Wilson wrote: > >>On Thu, Dec 04, 2014 at 10:19:09AM +0000, Tvrtko Ursulin wrote: > >>> > >>>On 12/04/2014 09:53 AM, Chris Wilson wrote: > >>>>On Wed, Dec 03, 2014 at 02:59:25PM +0000, Tvrtko Ursulin wrote: > >>>>>+void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level, > >>>>>+ u32 flags) > >>>>>+{ > >>>>>+ struct sg_table *pages = i915_ggtt_view_pages(vma); > >>>>>+ > >>>>>+ if (pages && !IS_ERR(pages)) { > >>>>>+ vma->bind_vma(vma, pages, cache_level, flags); > >>>>>+ > >>>>>+ if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) { > >>>>>+ sg_free_table(pages); > >>>>>+ kfree(pages); > >>>>>+ } > >>>>>+ } > >>>>>+} > >>>> > >>>>Stop. Even if the failure path is impossible with the present > >>>>implementation, here you are masking the error only to go and pretend > >>>>the binding succeeded. > >>>> > >>>>Don't be lazy, this is a very nasty bug that should be hit during igt - > >>>>or else you are not testing well enough. > >>> > >>>Fair comment, even if a bit too assuming. I actually had this as > >>>TODO but somehow lost it. > >>> > >>>I don't have any ideas on how to provoke this to fail from an IGT? > >>>Even with future implementations it boils down to a couple of small > >>>allocations which would have to fail reliably. > >> > >>We have quite a few thrash tests now that are fairly good at getting > >>even the small allocations to fail. > >> > >>What we don't have is a single-fd, multi-ctx thrash test (well except > >>for some GL tests...) > > > >But none of these tests result in permanent memory failures (only the > >occasional ioctl restart when waiting for gpu rendering). And sg table > >alloc only recurses through the shrinker so that can't happen. So I think > >we just have to get by with review. > > > >We did have issues with sg table allocations in stress tests though, > >before we've added the recursive shrinker locking, hence sg table alloc > >can indeed go south. > > I looked at propagating errors from i915_vma_bind() out to callers > and it is mostly all fine apart from the > i915_gem_restore_gtt_mappings during i915_drm_resume. > > I don't see how this is fixable apart by going back and having sgls > stay around for the lifetime of their VMAs. It shouldn't be such a > big deal - they are not so big even with non-coalesced entries. The good news is that for a rebind, you already have the sg lists already allocated and converted to dma addresses, and so you can just do something like: static inline void i915_vma_rebind(struct i915_vma *vma, enum i915_cache_level cache_level) { unsigned rebind; rebind = vma->bound; vma->bound = 0; WARN_ON(vma->bind_vma(vma, cache_level, rebind)); } -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx