On 09/12/15 18:50, yu.dai@xxxxxxxxx wrote:
> From: Alex Dai <yu.dai@xxxxxxxxx>
>
> Split the GuC work queue space reservation from submission and move it
> to ring_alloc_request_extras. The reason is that a failure in the later
> i915_add_request() call won't be handled. In case a timeout happens
> here, the driver can return early in order to handle the error.
>
> v1: Move wq_reserve_space to ring_reserve_space
> v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
>
> Signed-off-by: Alex Dai <yu.dai@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 21 +++++++++------------
>  drivers/gpu/drm/i915/intel_guc.h           |  1 +
>  drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
>  3 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 226e9c0..f7bd038 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
>  			     sizeof(desc) * client->ctx_index);
>  }
>
> -/* Get valid workqueue item and return it back to offset */
> -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
I think the name is misleading, because we don't actually reserve anything here, just check that there is some free space in the WQ.
(We certainly don't WANT to reserve anything, because that would be difficult to clean up in the event of submission failure for any other reason. So I think it's only the name that needs changing.) Although ...
>  {
>  	struct guc_process_desc *desc;
>  	void *base;
>  	u32 size = sizeof(struct guc_wq_item);
>  	int ret = -ETIMEDOUT, timeout_counter = 200;
>
> +	if (!gc)
> +		return 0;
> +
>  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>  	desc = base + gc->proc_desc_offset;
>
>  	while (timeout_counter-- > 0) {
>  		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
... as an alternative strategy, we could cache the calculated freespace in the client structure; then if we already know there's at least 1 slot free from last time we checked, we could then just decrement the cached value and avoid the kmap+spinwait overhead. Only when we reach 0 would we have to go through this code to refresh our view of desc->head and recalculate the actual current freespace. [NB: clear cached value on reset?]
Does that sound like a useful optimisation?
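Roughly what I have in mind, as an untested sketch. Note that wq_space and guc_calc_workqueue_space() are names I've just invented for illustration; the slow path is only the existing kmap+spinwait loop, reworked into a helper that hands back the freshly calculated freespace:

	/*
	 * Hypothetical new field in struct i915_guc_client:
	 *	u32 wq_space;	(freespace seen at the last recalculation)
	 */

	int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
	{
		const u32 size = sizeof(struct guc_wq_item);
		u32 space;
		int ret;

		if (!gc)
			return 0;

		/* Fast path: the last recalculation found room for more
		 * than one item, so just consume one slot from the cached
		 * value and skip the kmap+spinwait entirely.
		 */
		if (gc->wq_space >= size) {
			gc->wq_space -= size;
			return 0;
		}

		/* Slow path: the existing kmap+spinwait code, as an
		 * invented helper returning the calculated freespace.
		 * [NB: gc->wq_space would need clearing on reset]
		 */
		ret = guc_calc_workqueue_space(gc, &space);
		if (ret == 0)
			gc->wq_space = space - size;

		return ret;
	}

The cached value can only ever underestimate the true freespace (the GuC consuming items just makes more room), so a stale value is safe; the worst case is an unnecessary trip through the slow path.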
> -			*offset = gc->wq_tail;
> -
> -			/* advance the tail for next workqueue item */
> -			gc->wq_tail += size;
> -			gc->wq_tail &= gc->wq_size - 1;
> -
>  			/* this will break the loop */
>  			timeout_counter = 0;
>  			ret = 0;
> @@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
>  	struct guc_wq_item *wqi;
>  	void *base;
>  	u32 tail, wq_len, wq_off = 0;
> -	int ret;
>
> -	ret = guc_get_workqueue_space(gc, &wq_off);
> -	if (ret)
> -		return ret;
> +	wq_off = gc->wq_tail;
> +
> +	/* advance the tail for next workqueue item */
> +	gc->wq_tail += sizeof(struct guc_wq_item);
> +	gc->wq_tail &= gc->wq_size - 1;
I was a bit unhappy about this code just assuming that there *must* be space (because we KNOW we've checked above) -- unless someone violated the proper calling sequence (TDR?). OTOH, it would be too expensive to go through the map-and-calculate code all over again just to catch an unlikely scenario. But, if we cache the last-calculated value as above, then the check could be cheap :)

For example, just add a pre_checked size field that's set by the pre-check and then checked and decremented on submission. There shouldn't be more than one submission in progress at a time, because dev->struct_mutex is held across the whole sequence (but it's not an error to see two pre-checks in a row, because a request can be abandoned partway).
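i.e. something like this at the top of guc_add_workqueue_item() (again only an illustrative sketch; wq_pre_checked is an invented field that a successful pre-check would set to the item size):

	/* Cheap sanity check: the pre-check in i915_guc_wq_reserve_space()
	 * must have run and found space for this item; if not, the proper
	 * calling sequence has been violated.
	 */
	if (WARN_ON(gc->wq_pre_checked < sizeof(struct guc_wq_item)))
		return -ENOSPC;

	/* consume the pre-checked slot */
	gc->wq_pre_checked -= sizeof(struct guc_wq_item);

	wq_off = gc->wq_tail;

	/* advance the tail for next workqueue item */
	gc->wq_tail += sizeof(struct guc_wq_item);
	gc->wq_tail &= gc->wq_size - 1;

Two pre-checks in a row would then just overwrite wq_pre_checked rather than accumulate, which matches the abandoned-request case above.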
>  	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
>  	 * should not have the case where structure wqi is across page, neither
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 0e048bf..59c8e21 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
>  		    struct drm_i915_gem_request *rq);
>  void i915_guc_submission_disable(struct drm_device *dev);
>  void i915_guc_submission_fini(struct drm_device *dev);
> +int i915_guc_wq_reserve_space(struct i915_guc_client *client);
>
>  #endif
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f96fb51..7d53d27 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
>  		return ret;
>  	}
>
> +	/* Reserve GuC WQ space here (one request needs one WQ item) because
> +	 * the later i915_add_request() call can't fail. */
> +	ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
> +	if (ret)
> +		return ret;
> +
>  	return 0;
>  }
Worth checking for GuC submission before that call? Maybe not ...

.Dave.