On Thu, Nov 03, 2016 at 11:03:47AM +0000, Tvrtko Ursulin wrote:
> 
> On 02/11/2016 17:50, Chris Wilson wrote:
> >The scheduler needs to know the dependencies of each request for the
> >lifetime of the request, as it may choose to reschedule the requests at
> >any time and must ensure the dependency tree is not broken. This is in
> >addition to using the fence to only allow execution after all
> >dependencies have been completed.
> >
> >One option was to extend the fence to support the bidirectional
> >dependency tracking required by the scheduler. However, the mismatch in
> >lifetimes between the submit fence and the request essentially meant
> >that we had to build a completely separate struct (and we could not
> >simply reuse the existing waitqueue in the fence for one half of the
> >dependency tracking). The extra dependency tracking simply did not mesh
> >well with the fence, and keeping it separate both keeps the fence
> >implementation simpler and allows us to extend the dependency tracking
> >into a priority tree (whilst maintaining support for reordering the
> >tree).
> >
> >To avoid the additional allocations and list manipulations, the use of
> >the priotree is disabled when there are no schedulers to use it.
> >
> >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >---
> > drivers/gpu/drm/i915/i915_gem_request.c | 72 ++++++++++++++++++++++++++++++++-
> > drivers/gpu/drm/i915/i915_gem_request.h | 23 +++++++++++
> > 2 files changed, 94 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >index 9c8605c834f9..13090f226203 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >@@ -113,6 +113,59 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> > 	spin_unlock(&file_priv->mm.lock);
> > }
> >
> >+static int
> >+i915_priotree_add_dependency(struct i915_priotree *pt,
> >+			     struct i915_priotree *signal,
> >+			     struct i915_dependency *dep)
> >+{
> >+	unsigned long flags = 0;
> >+
> >+	if (!dep) {
> >+		dep = kmalloc(sizeof(*dep), GFP_KERNEL);
> 
> I will mention a dedicated cache again since this could possibly be
> our hottest allocation path. With a dedicated slab I've seen it grow
> to 5-7k objects in some benchmarks, with the request slab around 1k
> at the same time.

I'm open to one. We allocate more of these than we do even for fences.
I was thinking it could be added later, but if we change the API to
always pass in the i915_dependency it will probably work better. (A
rough sketch of a dedicated cache is further down.)

> >+		if (!dep)
> >+			return -ENOMEM;
> >+
> >+		flags |= I915_DEPENDENCY_ALLOC;
> >+	}
> 
> Not sure if it would be any nicer to just set the flags after
> allocating to I915_DEPENDENCY_ALLOC and add an else path to set it
> to zero here.

I just tend to avoid if {} else {} if I can help it, just a personal
preference.

> >+struct i915_dependency {
> >+	struct i915_priotree *signal;
> >+	struct list_head pre_link, post_link;
> >+	unsigned long flags;
> >+#define I915_DEPENDENCY_ALLOC BIT(0)
> >+};
> >+
> >+struct i915_priotree {
> >+	struct list_head pre_list; /* who is before us, we depend upon */
> >+	struct list_head post_list; /* who is after us, they depend upon us */
> >+};
> 
> I need a picture to imagine this data structure. :(

The names suck.

 \|||/
  \|/
   .
  /|\
 /|||\

Googling bidirectional acyclic graph doesn't help that much.
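For what it's worth, here is roughly how I picture the linkage. This is
only an illustrative sketch (the sketch_* names are invented for the
example, and the flags, locking and allocation are left out), not the
code from the patch: each i915_dependency is one edge of the graph and
sits on two lists at once, the waiter's pre_list via pre_link and the
signaler's post_list via post_link, so the graph can be walked in
either direction.

#include <linux/list.h>

/* Sketch only: same shape as the structs quoted above, minus flags. */
struct sketch_priotree {
	struct list_head pre_list;	/* edges to everyone before us */
	struct list_head post_list;	/* edges to everyone after us */
};

struct sketch_dependency {		/* one edge: waiter -> signaler */
	struct sketch_priotree *signal;	/* whom we wait upon */
	struct list_head pre_link;	/* node on the waiter's pre_list */
	struct list_head post_link;	/* node on the signaler's post_list */
};

static void sketch_priotree_init(struct sketch_priotree *pt)
{
	INIT_LIST_HEAD(&pt->pre_list);
	INIT_LIST_HEAD(&pt->post_list);
}

/* Record "waiter depends upon signal" by threading one edge onto both lists. */
static void sketch_add_dependency(struct sketch_priotree *waiter,
				  struct sketch_priotree *signal,
				  struct sketch_dependency *dep)
{
	dep->signal = signal;
	list_add(&dep->pre_link, &waiter->pre_list);
	list_add(&dep->post_link, &signal->post_list);
}

/* The scheduler can then walk either direction, e.g. count our signalers. */
static unsigned int sketch_count_signalers(struct sketch_priotree *waiter)
{
	struct sketch_dependency *dep;
	unsigned int count = 0;

	list_for_each_entry(dep, &waiter->pre_list, pre_link)
		count++;

	return count;
}

So a single i915_dependency node gives both the "who is before us" and
"who is after us" views, which is the bidirectional part the fence
could not give us.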
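On the dedicated cache suggestion above, something along these lines is
what I would expect it to look like -- again purely a sketch with
invented names (slab_dependencies and the sketch_* helpers are not an
interface in the patch):

#include <linux/slab.h>

/* Hypothetical dedicated slab for i915_dependency nodes; assumes
 * struct i915_dependency from i915_gem_request.h is visible here. */
static struct kmem_cache *slab_dependencies;

static int sketch_dependencies_init(void)
{
	slab_dependencies = KMEM_CACHE(i915_dependency, SLAB_HWCACHE_ALIGN);
	return slab_dependencies ? 0 : -ENOMEM;
}

static struct i915_dependency *sketch_dependency_alloc(gfp_t gfp)
{
	return kmem_cache_alloc(slab_dependencies, gfp);
}

static void sketch_dependency_free(struct i915_dependency *dep)
{
	kmem_cache_free(slab_dependencies, dep);
}

i915_priotree_add_dependency() would then use the cache instead of
kmalloc(), with the I915_DEPENDENCY_ALLOC flag handling unchanged.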
> >+
> > /**
> >  * Request queue structure.
> >  *
> >@@ -89,6 +101,17 @@ struct drm_i915_gem_request {
> > 	wait_queue_t submitq;
> > 	wait_queue_t execq;
> >
> >+	/* A list of everyone we wait upon, and everyone who waits upon us.
> >+	 * Even though we will not be submitted to the hardware before the
> >+	 * submit fence is signaled (it waits for all external events as well
> >+	 * as our own requests), the scheduler still needs to know the
> >+	 * dependency tree for the lifetime of the request (from execbuf
> >+	 * to retirement), i.e. bidirectional dependency information for the
> >+	 * request not tied to individual fences.
> >+	 */
> >+	struct i915_priotree priotree;
> >+	struct i915_dependency depq;
> 
> Why depq and not dep_node or something? Makes me wonder what I am
> missing about the "q". It is not a queue?!?

Just a pattern was forming. It's just as much of a queue as our
wait_queue_t above :) Point taken.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx