On Fri, Oct 07, 2016 at 05:52:47PM +0100, Tvrtko Ursulin wrote:
> 
> On 07/10/2016 10:46, Chris Wilson wrote:
> >Quite a few of our objects used for internal hardware programming do not
> >benefit from being swappable or from being zero initialised. As such
> >they do not benefit from using a shmemfs backing storage and since they
> >are internal and never directly exposed to the user, we do not need to
> >worry about providing a filp. For these we can use a
> >drm_i915_gem_object wrapper around a sg_table of plain struct page. They
> >are not swap backed and not automatically pinned. If they are reaped
> >by the shrinker, the pages are released and the contents discarded. For
> >the internal use case, this is fine as, for example, ringbuffers are
> >pinned from the point they are written by a request until they are read
> >by the hardware. Once they are idle, they can be discarded entirely.
> >As such they are a good match for execlist ringbuffers and a small
> >variety of other internal objects.
> >
> >In the first iteration, this is limited to the scratch batch buffers we
> >use (for command parsing and state initialisation).
> >
> >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >---
> > drivers/gpu/drm/i915/Makefile                |   1 +
> > drivers/gpu/drm/i915/i915_drv.h              |   5 +
> > drivers/gpu/drm/i915/i915_gem_batch_pool.c   |  28 ++---
> > drivers/gpu/drm/i915/i915_gem_internal.c     | 161 +++++++++++++++++++++++++++
> > drivers/gpu/drm/i915/i915_gem_render_state.c |   2 +-
> > drivers/gpu/drm/i915/intel_engine_cs.c       |   2 +-
> > drivers/gpu/drm/i915/intel_ringbuffer.c      |  14 ++-
> > 7 files changed, 189 insertions(+), 24 deletions(-)
> > create mode 100644 drivers/gpu/drm/i915/i915_gem_internal.c
> >
> >diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> >index a998c2bce70a..b94a90f34d2d 100644
> >--- a/drivers/gpu/drm/i915/Makefile
> >+++ b/drivers/gpu/drm/i915/Makefile
> >@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
> >	  i915_gem_execbuffer.o \
> >	  i915_gem_fence.o \
> >	  i915_gem_gtt.o \
> >+	  i915_gem_internal.o \
> >	  i915_gem.o \
> >	  i915_gem_render_state.o \
> >	  i915_gem_request.o \
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index fee5cc92e2f2..bad97f1e5265 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -3538,6 +3538,11 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >						  u32 gtt_offset,
> >						  u32 size);
> >+/* i915_gem_internal.c */
> >+struct drm_i915_gem_object *
> >+i915_gem_object_create_internal(struct drm_device *dev,
> >+				unsigned int size);
> >+
> 
> Wasn't size_t our convention for GEM objects?

Not our convention, no. Using size_t has caused so many bugs that we
started to eradicate it (in our usual piecemeal approach).

> >+	/* 965gm cannot relocate objects above 4GiB.
> >	 */
> >+		gfp &= ~__GFP_HIGHMEM;
> >+		gfp |= __GFP_DMA32;
> >+	}
> >+
> >+	for (i = 0; i < npages; i++) {
> >+		struct page *page;
> >+
> >+		page = alloc_page(gfp);
> >+		if (!page)
> >+			goto err;
> >+
> >+#ifdef CONFIG_SWIOTLB
> >+		if (swiotlb_nr_tbl()) {
> >+			st->nents++;
> >+			sg_set_page(sg, page, PAGE_SIZE, 0);
> >+			sg = sg_next(sg);
> >+			continue;
> >+		}
> >+#endif
> >+		if (!i || page_to_pfn(page) != last_pfn + 1) {
> >+			if (i)
> >+				sg = sg_next(sg);
> >+			st->nents++;
> >+			sg_set_page(sg, page, PAGE_SIZE, 0);
> >+		} else {
> >+			sg->length += PAGE_SIZE;
> >+		}
> >+		last_pfn = page_to_pfn(page);
> >+	}
> >+#ifdef CONFIG_SWIOTLB
> >+	if (!swiotlb_nr_tbl())
> >+#endif
> >+		sg_mark_end(sg);
> 
> Looks like the loop above could be moved into a helper and shared
> with i915_gem_object_get_pages_gtt. Maybe just page-alloc and
> page-alloc-error callbacks would be required.

So just the entire thing as a callback... I would have thought you might
suggest trying high order allocations and falling back to low order.
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx