Re: [PATCH 2/3] drm/i915: Allow sharing the idle-barrier from other kernel requests

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Thu, 01 Aug 2019 18:55:53 +0100

Quoting Tvrtko Ursulin (2019-08-01 18:24:50)
> 
> On 30/07/2019 12:21, Chris Wilson wrote:
> > @@ -224,8 +283,15 @@ int i915_active_ref(struct i915_active *ref,
> >               goto out;
> >       }
> >   
> > -     if (!i915_active_request_isset(active))
> > -             atomic_inc(&ref->count);
> > +     if (is_barrier(active)) { /* proto-node used by our idle barrier */
> > +             __active_del_barrier(ref, node_from_active(active));
> > +             RCU_INIT_POINTER(active->request, NULL);
> > +             INIT_LIST_HEAD(&active->link);
> 
> So this, when prepare_remote_request calls it on ce->active, will remove 
> the idle barrier from the engine->barriers_list and immediately transfer 
> it to the rq->active_list?

Yes. So long as the context matches the idle-barrier, i.e. we are
operating on this context from the kernel_context.

> Then when this request is retired we know the 
> context is idle?

So long as the context itself hasn't claimed a new active reference for
its own requests.

> But what if it is the same context? Then it is not idle yet.

Correct, but that would only happens for the kernel_context, which is the
special case where we force idle on parking and exclude from the
deferred active-barrier (as it would continually defer instead of
parking).

> > +static inline bool is_idle_barrier(struct active_node *node, u64 idx)
> > +{
> > +     return node->timeline == idx && !i915_active_request_isset(&node->base);
> > +}
> > +
> > +static struct active_node *idle_barrier(struct i915_active *ref, u64 idx)
> > +{
> > +     struct rb_node *prev, *p;
> > +
> > +     if (RB_EMPTY_ROOT(&ref->tree))
> > +             return NULL;
> > +
> > +     mutex_lock(&ref->mutex);
> > +     GEM_BUG_ON(i915_active_is_idle(ref));
> > +
> > +     /*
> > +      * Try to reuse any existing barrier nodes already allocated for this
> > +      * i915_active, due to overlapping active phases there is likely a
> 
> For this i915_active or for this idx? It is this i915_active by 
> definition, no? And idx also has to match in both loops below.

idx is a timeline, which although would normally be unique in the tree,
due to the active-barrier we may have more than one instance, because we
may have overlapping i915_active phases of activity.

This function exists to try and reuse any of the older phases that we
could not reap as we never completely idled. (Before we would
accidentally keep growing the tree with new nodes until we hit a point
where we retired the idle timeline after parking.)

> > +      * node kept alive (as we reuse before parking). We prefer to reuse
> > +      * completely idle barriers (less hassle in manipulating the llists),
> > +      * but otherwise any will do.
> > +      */
> > +     if (ref->cache && is_idle_barrier(ref->cache, idx)) {
> > +             p = &ref->cache->node;
> > +             goto match;
> > +     }
> > +
> > +     prev = NULL;
> > +     p = ref->tree.rb_node;
> > +     while (p) {
> > +             struct active_node *node =
> > +                     rb_entry(p, struct active_node, node);
> > +
> > +             if (is_idle_barrier(node, idx))
> > +                     goto match;
> > +
> > +             prev = p;
> > +             if (node->timeline < idx)
> > +                     p = p->rb_right;
> > +             else
> > +                     p = p->rb_left;
> > +     }
> > +
> > +     for (p = prev; p; p = rb_next(p)) {
> > +             struct active_node *node =
> > +                     rb_entry(p, struct active_node, node);
> > +
> > +             if (node->timeline > idx)
> > +                     break;
> > +
> > +             if (node->timeline < idx)
> > +                     continue;
> > +
> > +             if (!i915_active_request_isset(&node->base))
> > +                     goto match;
> 
> Isn't this idle barrier, so same as above? How can above not find it, 
> and find it here?

We didn't check all nodes, only followed the binary tree down one
branch to find the starting point. I anticipate we may end up with more
than one idle-barrier in the tree.

> > +
> > +             if (is_barrier(&node->base) && __active_del_barrier(ref, node))
> > +                     goto match;
> 
> And this is yet unused barrier with the right idx. Under what 
> circumstances can __active_del_barrier fail then? I think a comment is 
> needed.

__active_del_barrier() here can theoretically race with
i915_request_add_active_barriers(). In the other callsite, we must be
holding the kernel_context timeline mutex and so cannot be concurrently
calling add_active_barriers(). [Here is the weakness of using
__active_del_barrier(), in order to reuse an activated idle-barrier, we
may cause add_active_barriers() to miss the others and so postpone the
retirement of other contexts. Under the current scheme of parking, that
is not an issue. In the future, it means retirement may miss a
heartbeat.]

> > +             if (!i915_active_request_isset(&node->base)) {
> > +                     RCU_INIT_POINTER(node->base.request, ERR_PTR(-EAGAIN));
> > +                     node->base.link.prev = (void *)engine;
> 
> Comment here please - what for and why it is safe.

We're simply stashing pointers inside a block of memory that we own and
pass to ourselves later on. We've made sure it is not accessible from
the tree.

> > +                     atomic_inc(&ref->count);
> > +             }
> >   
> > +             GEM_BUG_ON(barrier_to_engine(node) != engine);
> > +             llist_add(barrier_to_ll(node), &ref->preallocated_barriers);
> >               intel_engine_pm_get(engine);
> > -             llist_add((struct llist_node *)&node->base.link,
> > -                       &ref->barriers);
> >       }
> >   
> >       return 0;
> >   
> >   unwind:
> > -     llist_for_each_safe(pos, next, llist_del_all(&ref->barriers)) {
> > -             struct active_node *node;
> > +     llist_for_each_safe(pos, next, get_preallocated_barriers(ref)) {
> 
> s/get/take/ or remove, extract?

take, hopefully there's a column spare.

> It's extremely complicated!

No kidding! last_retired_context was so much easier to understand.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx