On 09/12/15 18:50, yu.dai@xxxxxxxxx wrote:
> From: Alex Dai <yu.dai@xxxxxxxxx>
>
> Split the GuC work queue space reservation from submission and move it
> to ring_alloc_request_extras. The reason is that a failure in the later
> i915_add_request() call won't be handled. In case a timeout happens
> here, the driver can return early in order to handle the error.
>
> v1: Move wq_reserve_space to ring_reserve_space
> v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
>
> Signed-off-by: Alex Dai <yu.dai@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 21 +++++++++------------
>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>  drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
>  3 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 226e9c0..f7bd038 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
>  			     sizeof(desc) * client->ctx_index);
>  }
>
> -/* Get valid workqueue item and return it back to offset */
> -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
I think the name is misleading, because we don't actually reserve anything here, just check that there is some free space in the WQ.
(We certainly don't WANT to reserve anything, because that would be difficult to clean up in the event of submission failure for any other reason. So I think it's only the name that needs changing.) Although ...
>  {
>  	struct guc_process_desc *desc;
>  	void *base;
>  	u32 size = sizeof(struct guc_wq_item);
>  	int ret = -ETIMEDOUT, timeout_counter = 200;
>
> +	if (!gc)
> +		return 0;
> +
>  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>  	desc = base + gc->proc_desc_offset;
>
>  	while (timeout_counter-- > 0) {
>  		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
... as an alternative strategy, we could cache the calculated freespace in the client structure; then if we already know there's at least 1 slot free from last time we checked, we could then just decrement the cached value and avoid the kmap+spinwait overhead. Only when we reach 0 would we have to go through this code to refresh our view of desc->head and recalculate the actual current freespace. [NB: clear cached value on reset?]
Does that sound like a useful optimisation?
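Roughly what I have in mind, as an untested sketch. Note that wq_space and guc_calc_workqueue_space() are names I've just invented for illustration; the slow path is only the existing kmap+spinwait loop, reworked into a helper that hands back the freshly calculated freespace:

	/*
	 * Hypothetical new field in struct i915_guc_client:
	 *	u32 wq_space;	(freespace seen at the last recalculation)
	 */

	int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
	{
		const u32 size = sizeof(struct guc_wq_item);
		u32 space;
		int ret;

		if (!gc)
			return 0;

		/* Fast path: the last recalculation found room for more
		 * than one item, so just consume one slot from the cached
		 * value and skip the kmap+spinwait entirely.
		 */
		if (gc->wq_space >= size) {
			gc->wq_space -= size;
			return 0;
		}

		/* Slow path: the existing kmap+spinwait code, as an
		 * invented helper returning the calculated freespace.
		 * [NB: gc->wq_space would need clearing on reset]
		 */
		ret = guc_calc_workqueue_space(gc, &space);
		if (ret == 0)
			gc->wq_space = space - size;

		return ret;
	}

The cached value can only ever underestimate the true freespace (the GuC consuming items just makes more room), so a stale value is safe; the worst case is an unnecessary trip through the slow path.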
> -			*offset = gc->wq_tail;
> -
> -			/* advance the tail for next workqueue item */
> -			gc->wq_tail += size;
> -			gc->wq_tail &= gc->wq_size - 1;
> -
>  			/* this will break the loop */
>  			timeout_counter = 0;
>  			ret = 0;
> @@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
>  	struct guc_wq_item *wqi;
>  	void *base;
>  	u32 tail, wq_len, wq_off = 0;
> -	int ret;
>
> -	ret = guc_get_workqueue_space(gc, &wq_off);
> -	if (ret)
> -		return ret;
> +	wq_off = gc->wq_tail;
> +
> +	/* advance the tail for next workqueue item */
> +	gc->wq_tail += sizeof(struct guc_wq_item);
> +	gc->wq_tail &= gc->wq_size - 1;
I was a bit unhappy about this code just assuming that there *must* be space (because we KNOW we've checked above) -- unless someone violated the proper calling sequence (TDR?). OTOH, it would be too expensive to go through the map-and-calculate code all over again just to catch an unlikely scenario. But, if we cache the last-calculated value as above, then the check could be cheap :)

For example, just add a pre_checked size field that's set by the pre-check and then checked and decremented on submission. There shouldn't be more than one submission in progress at a time, because dev->struct_mutex is held across the whole sequence (but it's not an error to see two pre-checks in a row, because a request can be abandoned partway).
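i.e. something like this at the top of guc_add_workqueue_item() (again only an illustrative sketch; wq_pre_checked is an invented field that a successful pre-check would set to the item size):

	/* Cheap sanity check: the pre-check in i915_guc_wq_reserve_space()
	 * must have run and found space for this item; if not, the proper
	 * calling sequence has been violated.
	 */
	if (WARN_ON(gc->wq_pre_checked < sizeof(struct guc_wq_item)))
		return -ENOSPC;

	/* consume the pre-checked slot */
	gc->wq_pre_checked -= sizeof(struct guc_wq_item);

	wq_off = gc->wq_tail;

	/* advance the tail for next workqueue item */
	gc->wq_tail += sizeof(struct guc_wq_item);
	gc->wq_tail &= gc->wq_size - 1;

Two pre-checks in a row would then just overwrite wq_pre_checked rather than accumulate, which matches the abandoned-request case above.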
>  	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
>  	 * should not have the case where structure wqi is across page, neither
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 0e048bf..59c8e21 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
>  		    struct drm_i915_gem_request *rq);
>  void i915_guc_submission_disable(struct drm_device *dev);
>  void i915_guc_submission_fini(struct drm_device *dev);
> +int i915_guc_wq_reserve_space(struct i915_guc_client *client);
>
>  #endif
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f96fb51..7d53d27 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
>  		return ret;
>  	}
>
> +	/* Reserve GuC WQ space here (one request needs one WQ item) because
> +	 * the later i915_add_request() call can't fail. */
> +	ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
> +	if (ret)
> +		return ret;
> +
>  	return 0;
>  }
Worth checking for GuC submission before that call? Maybe not ...

.Dave.