On 09/07/2020 13:07, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2020-07-09 12:59:51)
On 09/07/2020 12:07, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2020-07-09 12:01:29)
On 08/07/2020 16:36, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2020-07-08 15:24:20)
And what is the effective behaviour you get with N contexts - emit N
concurrent operations and for N + 1 block in execbuf?
Each context defines a timeline. A task is not ready to run until the
task before it in its timeline is completed. So we don't block in
execbuf, the scheduler waits until the request is ready before putting
it into the HW queues -- i.e. the normal chain of fences with everything
that entails about ensuring it runs to completion [whether successfully
or not, if not we then rely on the error propagation to limit the damage
and report it back to the user if they kept a fence around to inspect].
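
A minimal sketch of the timeline rule described above, as a toy model rather than the i915 implementation. The names Request, Timeline and runnable are hypothetical and do not come from the driver. It only illustrates that submission never blocks and that a request becomes ready once its predecessor on the same timeline has completed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    name: str
    prev: Optional["Request"] = None  # previous request on the same timeline
    done: bool = False

    def is_ready(self) -> bool:
        # Ready only once the predecessor on this timeline has completed.
        return not self.done and (self.prev is None or self.prev.done)


@dataclass
class Timeline:
    last: Optional[Request] = None

    def submit(self, name: str) -> Request:
        # Never blocks: the new request is chained behind the last one.
        rq = Request(name, prev=self.last)
        self.last = rq
        return rq


def runnable(requests: List[Request]) -> List[str]:
    # What a scheduler could move to the HW queues right now.
    return [rq.name for rq in requests if rq.is_ready()]


a, b = Timeline(), Timeline()
pending = [a.submit("a0"), a.submit("a1"), b.submit("b0")]
print(runnable(pending))  # a0 and b0 are ready, a1 waits on a0
pending[0].done = True
print(runnable(pending))  # a1 is now ready alongside b0
```

Error propagation along the chain is left out of the sketch.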
Okay but what is the benefit of N contexts in this series, before the
work is actually spread over ctx async width CPUs? Is there any? If not
I would prefer this patch is delayed until the time some actual
parallelism is ready to be added.
We currently submit an unbounded amount of work. This patch is added
along with its user to restrict the amount of work allowed to run in
parallel, and also is used to [crudely] serialise the multiple threads
attempting to allocate space in the vm when we completely exhaust that
address space. We need at least one fence-context id for each user; this
patch took the opportunity to generalise that to N ids for each user.
Right, this is what I asked at the beginning - restricting the amount of
work run in parallel - does it mean there is some "blocking"/serialisation
during execbuf? Or is it all async, but then what is restricted?
It's all* async, so the number of workqueues we utilise is restricted,
and so limits the number of CPUs we allow the one context to spread
across with multiple execbufs.
*fsvo all.
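
One way to picture "all async, but restricted" is the toy model below. It assumes a context owns WIDTH fence-context ids and that execbufs are placed on them round-robin. The round-robin choice and the names WIDTH and assign are assumptions made for illustration; the thread does not say how the patch picks a slot. With N slots, execbuf N + 1 is not blocked at submission, it is queued behind the first one.

```python
from typing import Dict, List, Optional

WIDTH = 2  # hypothetical number of fence-context ids given to one context


def assign(num_execbufs: int, width: int = WIDTH) -> List[Optional[int]]:
    """For each execbuf, return the index of the earlier execbuf it must
    wait for, or None if it is free to run immediately."""
    last_on_slot: Dict[int, int] = {}
    deps: List[Optional[int]] = []
    for k in range(num_execbufs):
        slot = k % width  # assumed round-robin slot selection
        deps.append(last_on_slot.get(slot))
        last_on_slot[slot] = k
    return deps


print(assign(5))  # [None, None, 0, 1, 2]
```

At most WIDTH execbufs have no dependency at any time, so WIDTH caps how many can be processed in parallel.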
Okay.
Related topic - have we ever thought about what happens when fence
context id wraps? I know it's 64-bit, and even with this patch giving
out num_cpus blocks, it still feels impossible that it would wrap in
normal use. But I wonder if a malicious client could create/destroy
contexts to cause a wrap and then how well we handle it. I am probably
just underestimating today how big 64-bit is and how many ioctls that
would require..
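
A back-of-the-envelope estimate of how long a wrap would take. Both inputs are assumptions picked for illustration: 64 ids handed out per context (a 64-CPU machine) and one million context creations per second, which is probably generous for an ioctl path.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_wrap(ids_per_context: int, contexts_per_second: int) -> float:
    # Time to hand out all 2**64 fence context ids at a constant rate.
    total_ids = 2 ** 64
    seconds = total_ids / (ids_per_context * contexts_per_second)
    return seconds / SECONDS_PER_YEAR


# Assumed: 64 ids per context, 1e6 context creations per second.
print(round(years_to_wrap(64, 1_000_000)))
```

Under those assumptions the id space lasts on the order of nine thousand years, so the rate of context creation would have to be several orders of magnitude higher before a wrap became practical.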
Regards,
Tvrtko