On 03/11/2016 11:55, Chris Wilson wrote:
On Thu, Nov 03, 2016 at 11:03:47AM +0000, Tvrtko Ursulin wrote:
On 02/11/2016 17:50, Chris Wilson wrote:
The scheduler needs to know the dependencies of each request for the
lifetime of the request, as it may choose to reschedule the requests at
any time and must ensure the dependency tree is not broken. This is in
addition to using the fence to only allow execution after all
dependencies have been completed.
One option was to extend the fence to support the bidirectional
dependency tracking required by the scheduler. However the mismatch in
lifetimes between the submit fence and the request essentially meant
that we had to build a completely separate struct (and we could not
simply reuse the existing waitqueue in the fence for one half of the
dependency tracking). The extra dependency tracking simply did not mesh
well with the fence, and keeping it separate both keeps the fence
implementation simpler and allows us to extend the dependency tracking
into a priority tree (whilst maintaining support for reordering the
tree).
To avoid the additional allocations and list manipulations, the use of
the priotree is disabled when there are no schedulers to use it.
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem_request.c | 72 ++++++++++++++++++++++++++++++++-
drivers/gpu/drm/i915/i915_gem_request.h | 23 +++++++++++
2 files changed, 94 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 9c8605c834f9..13090f226203 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -113,6 +113,59 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
spin_unlock(&file_priv->mm.lock);
}
+static int
+i915_priotree_add_dependency(struct i915_priotree *pt,
+ struct i915_priotree *signal,
+ struct i915_dependency *dep)
+{
+ unsigned long flags = 0;
+
+ if (!dep) {
+ dep = kmalloc(sizeof(*dep), GFP_KERNEL);
I will mention a dedicated cache again since this could possibly be
our hottest allocation path. With a dedicated slab I've seen it grow
to 5-7k objects in some benchmarks, with the request slab around 1k
at the same time.
I'm open to one. We allocate more of these than we do even for fences. I
was thinking it could be added later, but if we change the api to always
pass in the i915_dependency it will probably work better.
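A minimal sketch of what a dedicated slab could look like; the cache
variable and the init/fini hooks below are hypothetical, not part of
this patch:

	/* requires <linux/slab.h> */
	static struct kmem_cache *slab_dependencies;

	int i915_dependencies_init(void)
	{
		/* hot allocation path, so ask for cacheline alignment */
		slab_dependencies = KMEM_CACHE(i915_dependency,
					       SLAB_HWCACHE_ALIGN);
		return slab_dependencies ? 0 : -ENOMEM;
	}

	void i915_dependencies_fini(void)
	{
		kmem_cache_destroy(slab_dependencies);
	}

with the kmalloc above becoming

	dep = kmem_cache_alloc(slab_dependencies, GFP_KERNEL);

and the matching kfree becoming kmem_cache_free(slab_dependencies, dep).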
+ if (!dep)
+ return -ENOMEM;
+
+ flags |= I915_DEPENDENCY_ALLOC;
+ }
Not sure if it would be any nicer to just set flags to
I915_DEPENDENCY_ALLOC after allocating, and add an else path to set
it to zero here.
I just tend to avoid if {} else {} if I can help it; just a personal
preference.
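For reference, the two variants side by side (sketch only, same
behaviour either way):

	/* as in the patch: accumulate into pre-initialised flags */
	unsigned long flags = 0;

	if (!dep) {
		dep = kmalloc(sizeof(*dep), GFP_KERNEL);
		if (!dep)
			return -ENOMEM;

		flags |= I915_DEPENDENCY_ALLOC;
	}

	/* suggested alternative: explicit if {} else {} */
	unsigned long flags;

	if (!dep) {
		dep = kmalloc(sizeof(*dep), GFP_KERNEL);
		if (!dep)
			return -ENOMEM;

		flags = I915_DEPENDENCY_ALLOC;
	} else {
		flags = 0;
	}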
+struct i915_dependency {
+ struct i915_priotree *signal;
+ struct list_head pre_link, post_link;
+ unsigned long flags;
+#define I915_DEPENDENCY_ALLOC BIT(0)
+};
+
+struct i915_priotree {
+ struct list_head pre_list; /* who is before us, we depend upon */
+ struct list_head post_list; /* who is after us, they depend upon us */
+};
I need a picture to imagine this data structure. :(
The names suck.
When you wrote this, I assumed you would respin shortly with some better
names?
I tried to grasp it one more time since then but keep getting lost. :I
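For what it's worth, a picture of one dependency edge as I read the
struct comments (a sketch; the list_add() calls are not quoted above,
so the exact link/list pairing is an assumption):

	request A (signaler)              request B (waiter on A)
	struct i915_priotree              struct i915_priotree
	+---------------+                 +---------------+
	| pre_list      |                 | pre_list  o---+---+
	| post_list o---+---+             | post_list     |   |
	+---------------+   |             +---------------+   |
	                    |                                 |
	                    |   struct i915_dependency        |
	                    |   +------------------------+    |
	          post_link +-->| signal = &A's priotree |<---+ pre_link
	                        +------------------------+

One i915_dependency is allocated per edge: its post_link sits on the
signaler's post_list ("who is after us"), its pre_link sits on the
waiter's pre_list ("who is before us"), and ->signal points back at
the signaler. Walking B's pre_list then enumerates everything B must
wait for; walking A's post_list enumerates everyone waiting on A.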
Regards,
Tvrtko