On Tue, Apr 07, 2015 at 04:20:25PM +0100, Chris Wilson wrote:
> The biggest user of i915_gem_object_get_page() is the relocation
> processing during execbuffer. Typically userspace passes in a set of
> relocations in sorted order. Sadly, we alternate between relocations
> increasing from the start of the buffers, and relocations decreasing
> from the end. However the majority of consecutive lookups will still be
> in the same page. We could cache the start of the last sg chain, however
> for most callers, the entire sgl is inside a single chain and so we see
> no improvement from the extra layer of caching.
> 
> v2: Avoid the double increment inside unlikely()
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=88308
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: John Harrison <John.C.Harrison@xxxxxxxxx>

Indeed this makes gem_exec_big a lot faster. Queued for -next, thanks
for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 31 ++++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem.c |  4 ++++
>  2 files changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4f5dae9a23f9..51b21483b95f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1987,6 +1987,10 @@ struct drm_i915_gem_object {
>  
>  	struct sg_table *pages;
>  	int pages_pin_count;
> +	struct get_page {
> +		struct scatterlist *sg;
> +		int last;
> +	} get_page;
>  
>  	/* prime dma-buf support */
>  	void *dma_buf_vmapping;
> @@ -2665,15 +2669,32 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
>  				    int *needs_clflush);
>  
>  int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
> -static inline struct page *i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n)
> +
> +static inline int __sg_page_count(struct scatterlist *sg)
> +{
> +	return sg->length >> PAGE_SHIFT;
> +}
> +
> +static inline struct page *
> +i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n)
>  {
> -	struct sg_page_iter sg_iter;
> +	if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT))
> +		return NULL;
>  
> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, n)
> -		return sg_page_iter_page(&sg_iter);
> +	if (n < obj->get_page.last) {
> +		obj->get_page.sg = obj->pages->sgl;
> +		obj->get_page.last = 0;
> +	}
> +
> +	while (obj->get_page.last + __sg_page_count(obj->get_page.sg) <= n) {
> +		obj->get_page.last += __sg_page_count(obj->get_page.sg++);
> +		if (unlikely(sg_is_chain(obj->get_page.sg)))
> +			obj->get_page.sg = sg_chain_ptr(obj->get_page.sg);
> +	}
>  
> -	return NULL;
> +	return nth_page(sg_page(obj->get_page.sg), n - obj->get_page.last);
>  }
> +
>  static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
>  {
>  	BUG_ON(obj->pages == NULL);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index be4f2645b637..567affeafec4 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2178,6 +2178,10 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
>  		return ret;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list);
> +
> +	obj->get_page.sg = obj->pages->sgl;
> +	obj->get_page.last = 0;
> +
>  	return 0;
> }
> -- 
> 2.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch