Quoting Tvrtko Ursulin (2020-07-13 13:22:19)
> 
> On 09/07/2020 13:07, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-07-09 12:59:51)
> >>
> >> On 09/07/2020 12:07, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2020-07-09 12:01:29)
> >>>>
> >>>> On 08/07/2020 16:36, Chris Wilson wrote:
> >>>>> Quoting Tvrtko Ursulin (2020-07-08 15:24:20)
> >>>>>> And what is the effective behaviour you get with N contexts - emit N
> >>>>>> concurrent operations and for N + 1 block in execbuf?
> >>>>>
> >>>>> Each context defines a timeline. A task is not ready to run until the
> >>>>> task before it in its timeline is completed. So we don't block in
> >>>>> execbuf; the scheduler waits until the request is ready before putting
> >>>>> it into the HW queues -- i.e. the normal chain of fences with everything
> >>>>> that entails about ensuring it runs to completion [whether successfully
> >>>>> or not; if not, we then rely on the error propagation to limit the damage
> >>>>> and report it back to the user if they kept a fence around to inspect].
> >>>>
> >>>> Okay, but what is the benefit of N contexts in this series, before the
> >>>> work is actually spread over ctx async width CPUs? Is there any? If not,
> >>>> I would prefer this patch is delayed until the time some actual
> >>>> parallelism is ready to be added.
> >>>
> >>> We currently submit an unbounded amount of work. This patch is added
> >>> along with its user to restrict the amount of work allowed to run in
> >>> parallel, and it is also used to [crudely] serialise the multiple threads
> >>> attempting to allocate space in the vm when we completely exhaust that
> >>> address space. We need at least one fence-context id for each user; this
> >>> took the opportunity to generalise that to N ids for each user.
> >>
> >> Right, this is what I asked at the beginning - restricting the amount of
> >> work run in parallel - does that mean there is some "blocking"/serialisation
> >> during execbuf? Or is it all async, and then what is restricted?
> >
> > It's all* async, so the number of workqueues we utilise is restricted,
> > and so limits the number of CPUs we allow the one context to spread
> > across with multiple execbufs.
> >
> > *fsvo all.
> 
> Okay.
> 
> Related topic - have we ever thought about what happens when the fence
> context id wraps? I know it's 64-bit, and even with this patch giving
> out num_cpus blocks, it still feels impossible that it would wrap in
> normal use. But I wonder if a malicious client could create/destroy
> contexts to cause a wrap, and then how well we handle it. I am probably
> just underestimating today how big 64-bit is and how many ioctls that
> would require..

I've had cold sweats. We will get silent glitches. I *don't* think we
will corrupt kernel data and oops, but we will corrupt user data.
-Chris
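
[Editor's aside] For anyone skimming the thread, below is a minimal
userspace sketch, not i915 code, of the two points discussed above: how a
per-context timeline orders requests by (fence context, seqno) so that
submission itself never blocks, and a rough feel for how many
context-creating ioctls a 64-bit context counter can absorb before
wrapping. dma_fence_context_alloc() is the real kernel allocator the
numbers are modelled on; every other name here (context_alloc,
struct timeline, timeline_add_request) is hypothetical and purely
illustrative.

/*
 * Standalone sketch (assumed names, not i915 code).  The counter mimics
 * the kernel's 64-bit fence-context allocator, handed out in blocks,
 * and the arithmetic at the end estimates how long a client hammering
 * the allocator would need to wrap the 64-bit space.
 */
#include <stdint.h>
#include <stdio.h>

static uint64_t fence_context_counter; /* monotonic, never recycled */

/* Hand out a block of 'num' consecutive context ids. */
static uint64_t context_alloc(unsigned int num)
{
	uint64_t first = fence_context_counter;

	fence_context_counter += num;
	return first;
}

struct timeline {
	uint64_t context;    /* fence context id for this timeline */
	uint64_t last_seqno; /* seqno of the most recent request */
};

/*
 * Queue a request on a timeline: it is ordered after the previous
 * request simply by carrying the next seqno on the same context, so
 * submission never blocks -- the scheduler only runs it once the
 * preceding fence on that context has signalled.
 */
static uint64_t timeline_add_request(struct timeline *tl)
{
	return ++tl->last_seqno;
}

int main(void)
{
	struct timeline tl = { .context = context_alloc(1) };
	unsigned int i;

	for (i = 0; i < 3; i++)
		printf("request fence: context=%llu seqno=%llu\n",
		       (unsigned long long)tl.context,
		       (unsigned long long)timeline_add_request(&tl));

	/*
	 * Back-of-envelope wrap estimate: granting a (generous) 10 million
	 * context-creating ioctls per second, each consuming a block of
	 * 64 ids, exhausting 2^64 ids still takes on the order of
	 * centuries.
	 */
	const double rate = 10e6 * 64; /* ids consumed per second */
	printf("time to wrap: ~%.0f years\n",
	       18446744073709551616.0 / rate / (3600.0 * 24 * 365));
	return 0;
}

At those assumed rates the estimate works out to roughly 900 years, which
is why a wrap is out of reach in normal use; the concern in the thread is
only about what would happen if it ever did occur.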