On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote:
> With recent 4.10 kernel the graphics isn't coming up under Xen. First
> failure message is:
>
> [ 46.656649] i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)

Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we
call dma_map_sg() and pass the sg to swiotlb (in this case) for
remapping, and we do check for an error value of 0. After that error,
SWIOTLB_MAP_ERROR is propagated back and converted to 0 for
dma_map_sg(). That looks valid, and we should report ENOMEM back to the
caller.

> Later I see splats like:
>
> [ 49.393583] general protection fault: 0000 [#1] SMP

What was the faulting address? RAX is particularly non-pointer-like so I
wonder if we walked onto an uninitialised portion of the sgtable. We may
have tripped over a bug in our sg_page iterator.

The attached patch should prevent an early ENOMEM following the swiotlb
allocation failure. But I suspect that we will still be tripping up the
failure in the sg walker when binding to the GPU.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
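The error convention described above (a mapping call that returns 0 on failure, which the caller turns into ENOMEM) can be sketched in plain userspace C. This is only an illustration: toy_map_sg() and toy_prepare_pages() are hypothetical stand-ins, not the kernel's dma_map_sg() or i915_gem_gtt_prepare_pages(), and the backend_ok flag is an assumption used to simulate a full swiotlb buffer.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a dma_map_sg()-style call: returns the
 * number of entries mapped, or 0 if the backend could not map them.
 * backend_ok is an invented knob that simulates the bounce buffer
 * being full.
 */
static int toy_map_sg(int nents, int backend_ok)
{
	return backend_ok ? nents : 0;
}

/*
 * Hypothetical caller: a return of 0 from the mapping call is the
 * only failure signal, so it is converted into -ENOMEM here.
 */
static int toy_prepare_pages(int nents, int backend_ok)
{
	if (toy_map_sg(nents, backend_ok) == 0)
		return -ENOMEM;

	return 0;
}
```

The point of the sketch is that the failure cannot be silent as long as the caller checks for 0; anything that goes wrong afterwards has to come from how the error path is handled further up.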
From e3f9268d467768a31e19d21e2f45e5c9ddd9a0f8 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date: Mon, 19 Dec 2016 12:23:43 +0000
Subject: [PATCH] drm/i915: Fallback to single PAGE_SIZE segments for DMA
 remapping

If we at first do not succeed with attempting to remap our physical
pages using a coalesced scattergather list, try again with one
scattergather entry per page. This should help with swiotlb as it uses a
limited buffer size and only searches for contiguous chunks within its
buffer aligned up to the next boundary - i.e. we may prematurely cause a
failure as we are unable to utilize the unused space between large
chunks and trigger an error such as:

	i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)

Reported-by: Juergen Gross <jgross@xxxxxxxx>
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 412f3513f269..509d98887e04 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2326,7 +2326,8 @@ static struct sg_table *
 i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
-	int page_count, i;
+	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	unsigned long i;
 	struct address_space *mapping;
 	struct sg_table *st;
 	struct scatterlist *sg;
@@ -2352,7 +2353,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	page_count = obj->base.size / PAGE_SIZE;
+rebuild_st:
 	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
 		kfree(st);
 		return ERR_PTR(-ENOMEM);
@@ -2411,8 +2412,21 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	i915_sg_trim(st);
 
 	ret = i915_gem_gtt_prepare_pages(obj, st);
-	if (ret)
-		goto err_pages;
+	if (ret) {
+		/* DMA remapping failed? One possible cause is that
+		 * it could not reserve enough large entries, asking
+		 * for PAGE_SIZE chunks may be helpful.
+		 */
+		if (max_segment > PAGE_SIZE) {
+			for_each_sgt_page(page, sgt_iter, st)
+				put_page(page);
+			sg_free_table(st);
+
+			max_segment = PAGE_SIZE;
+			goto rebuild_st;
+		} else
+			goto err_pages;
+	}
 
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj, st);
-- 
2.11.0
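
To illustrate why retrying with one entry per page can succeed where a coalesced request fails, here is a small userspace sketch of the idea in the commit message. Everything in it is an assumption made for illustration: the toy_* names, the 16-slot pool and the first-fit search are invented and do not model the real swiotlb allocator or the i915 code. Sizes are counted in pages rather than bytes, so a max_segment of 1 plays the role of PAGE_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 16	/* hypothetical bounce buffer size, in pages */

struct toy_pool {
	bool used[TOY_SLOTS];
};

/* First-fit search for n contiguous free slots; claims them on success. */
static bool toy_pool_alloc(struct toy_pool *p, size_t n)
{
	size_t run = 0;

	if (n == 0 || n > TOY_SLOTS)
		return false;

	for (size_t i = 0; i < TOY_SLOTS; i++) {
		run = p->used[i] ? 0 : run + 1;
		if (run == n) {
			for (size_t j = i + 1 - n; j <= i; j++)
				p->used[j] = true;
			return true;
		}
	}
	return false;
}

/*
 * Map total_pages using segments of at most max_segment pages each.
 * All or nothing: on failure the pool is restored to its prior state.
 */
static bool toy_map(struct toy_pool *p, size_t total_pages, size_t max_segment)
{
	struct toy_pool saved = *p;
	size_t left = total_pages;

	if (max_segment == 0)
		return false;

	while (left > 0) {
		size_t n = left < max_segment ? left : max_segment;

		if (!toy_pool_alloc(p, n)) {
			*p = saved;
			return false;
		}
		left -= n;
	}
	return true;
}

/*
 * Try the coalesced mapping first; if it fails and the segments were
 * larger than one page, retry once with single-page segments.
 * Returns the segment size that worked, or 0 if both attempts failed.
 */
static size_t toy_map_with_fallback(struct toy_pool *p, size_t total_pages,
				    size_t max_segment)
{
	for (;;) {
		if (toy_map(p, total_pages, max_segment))
			return max_segment;
		if (max_segment <= 1)
			return 0;
		max_segment = 1;
	}
}

/* A pool with every fourth slot taken: 12 slots free, longest run is 3. */
static struct toy_pool toy_fragmented_pool(void)
{
	struct toy_pool p = { { false } };

	for (size_t i = 3; i < TOY_SLOTS; i += 4)
		p.used[i] = true;
	return p;
}
```

With the fragmented pool, an 8-page request as a single segment fails even though 12 slots are free, while the same request split into single pages fits. The fallback only helps with fragmentation; once fewer free slots remain than pages requested, both attempts fail and the caller still has to handle the error.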
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx