Quoting Bloomfield, Jon (2019-09-20 16:50:57)
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Tvrtko
> > Ursulin
> > Sent: Friday, September 20, 2019 8:12 AM
> > To: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [PATCH] drm/i915: Prevent bonded requests from
> > overtaking each other on preemption
> >
> > On 20/09/2019 15:57, Chris Wilson wrote:
> > > Quoting Chris Wilson (2019-09-20 09:36:24)
> > >> Force bonded requests to run on distinct engines so that they cannot be
> > >> shuffled onto the same engine where timeslicing will reverse the order.
> > >> A bonded request will often wait on a semaphore signaled by its master,
> > >> creating an implicit dependency -- if we ignore that implicit dependency
> > >> and allow the bonded request to run on the same engine and before its
> > >> master, we will cause a GPU hang.
> > >
> > > Thinking more, it should not directly cause a GPU hang, as the stuck
> > > request should be timesliced away, and each preemption should be enough
> > > to keep hangcheck at bay (though we have evidence it may not). So at
> > > best it runs at half-speed, at worst a third (if my model is correct).
> >
> > But I think it is still correct to do since we don't have the coupling
> > information on re-submit. Hm.. but don't we need to prevent the slave
> > from changing engines as well?
>
> Unless I'm missing something, the proposal here is to set the engines in
> stone at first submission, and never change them?

For submission here, think execution (submission to actual HW). (We have
2 separate phases that both like to be called submit()!)

> If so, that does sound overly restrictive, and will prevent any kind of
> rebalancing as workloads (of varying slave counts) come and go.

We are only restricting this request, not the contexts. We still have
balancing overall, just not instantaneous balancing if we timeslice out
of this request -- we put it back onto the "same" engine and not another.

Which is in some ways less than ideal, although strictly we are only
saying don't put it back onto an engine we have earmarked for our bonded
request, and so we avoid contending with our parallel request, which
would reduce that to serial (and often bad) behaviour.

[So at the end of this statement, I'm more happy with the restriction ;]

> During the original design it was called out that the workloads should
> be pre-empted atomically. That allows the entire bonding mask to be
> re-evaluated at every context switch, and so we can then rebalance.
> Still not easy to achieve, I agree :-(

The problem with that statement is that atomic implies a global
scheduling decision. Blood, sweat and tears.

Of course, with your endless scheme, scheduling is all in the purview of
the user :)
-Chris
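
P.S. To make the restriction discussed above concrete, here is a minimal,
self-contained sketch in C. All the names and types here are hypothetical
illustrations, not the actual i915 code: it only models the idea that on
(re)submission a request is confined to engines not earmarked for its
bonded sibling, so master and bond can never collapse onto one engine.

```c
/*
 * Illustrative sketch only -- hypothetical types and names, not the
 * real i915 implementation. Models the restriction discussed above:
 * when a timesliced-out request is resubmitted, engines earmarked for
 * its bonded sibling are masked out of the selection, so the master
 * and its bonded request stay on distinct engines (where timeslicing
 * cannot reverse their order and stall on the semaphore).
 */
#include <stdint.h>
#include <stdio.h>

struct request {
	uint32_t allowed_engines; /* bitmask of engines the context may use */
	uint32_t bonded_engines;  /* engines earmarked for the bonded sibling */
	int engine;               /* engine last selected, -1 if none yet */
};

/* Pick the lowest-numbered allowed engine not reserved for the bond. */
static int pick_engine(const struct request *rq)
{
	uint32_t mask = rq->allowed_engines & ~rq->bonded_engines;

	if (!mask)
		return -1; /* over-constrained: no engine satisfies the bond */

	return __builtin_ctz(mask); /* first set bit = first usable engine */
}

int main(void)
{
	/* Master may run on engines 0-1; engine 1 is held for the bond. */
	struct request master = {
		.allowed_engines = 0x3,
		.bonded_engines  = 0x2,
		.engine          = -1,
	};

	/* On every (re)submission the master lands on engine 0 ... */
	master.engine = pick_engine(&master);
	printf("master -> engine %d\n", master.engine);

	/*
	 * ... so even after being timesliced out it cannot migrate onto
	 * engine 1 and end up queued behind (or contending with) the
	 * bonded request waiting on its semaphore.
	 */
	return 0;
}
```

Note the trade-off visible even in the sketch: the mask is per-request,
not per-context, so overall balancing across submissions is untouched;
only this request loses the freedom to hop engines mid-flight.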