On 25/09/2019 09:23, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-09-23 09:10:26)
On 20/09/2019 17:35, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-09-20 17:22:42)
On 02/09/2019 05:02, Chris Wilson wrote:
Since we cannot allocate underneath the vm->mutex (it is used in the
direct-reclaim paths), we need to shift the allocations off into a
mutexless worker with fence recursion prevention. To know when we need
this protection, we mark up the address spaces that do allocate before
insertion.
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
drivers/gpu/drm/i915/i915_gem_gtt.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9095f017162e..56d27cf09a3d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1500,6 +1500,7 @@ static struct i915_ppgtt *gen8_ppgtt_create(struct drm_i915_private *i915)
goto err_free_pd;
}
+ ppgtt->vm.bind_alloc = I915_VMA_LOCAL_BIND;
So this is re-using I915_VMA_LOCAL_BIND as a trick? Is it clear how that
works from these call sites? Should it be called bind_alloc*s*?
bind_allocates? Or should it be a boolean which is converted to a trick flag
in i915_vma_bind, where a comment can explain the trick?
Is it a trick? We need to differentiate between requests for LOCAL_BIND,
GLOBAL_BIND, LOCAL_BIND | GLOBAL_BIND, for different types of vm. Then I
have a plan on using the worker for GLOBAL_BIND on bsw/bxt to defer the
stop_machine().
What's the connection between "mark up the address spaces that do
allocate before insertion" and I915_VMA_LOCAL_BIND?
Full-ppgtt is only accessible by PIN_USER.
Aliasing-ppgtt is accessible from global-gtt as PIN_USER. Only if we
have an aliasing-gtt behind ggtt do we want to allocate for ggtt for
local binds.
global-gtt by itself never allocates and is expected to be synchronous.
However, we do use stop_machine() for bxt/bsw, and that unfortunately is
marked as an allocating mutex, so one idea I had for avoiding that
lockdep splat was to make bxt/bsw PIN_GLOBAL async.
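Roughly, that could translate into per-vm markers along the lines of the
sketch below. Only the first assignment is what this patch actually adds in
gen8_ppgtt_create(); the aliasing-ppgtt and bxt/bsw lines are just how I
picture it, and the ggtt->alias and needs_stop_machine() names are made up
here for illustration.

    /* full-ppgtt: PIN_USER (LOCAL_BIND) insertions allocate page tables */
    ppgtt->vm.bind_alloc = I915_VMA_LOCAL_BIND;

    /* sketch: aliasing-ppgtt behind the ggtt, so local binds allocate */
    if (ggtt->alias)
        ggtt->vm.bind_alloc = I915_VMA_LOCAL_BIND;

    /* sketch: bxt/bsw use stop_machine(), so PIN_GLOBAL would go async too */
    if (needs_stop_machine(i915))
        ggtt->vm.bind_alloc |= I915_VMA_GLOBAL_BIND;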
I think we are not understanding each other from the very start.
My point was that "vm.bind_alloc = I915_VMA_LOCAL_BIND", at least in my
understanding, effectively means "use the worker when pinning/binding
PIN_USER/I915_VMA_LOCAL_BIND". And that is, I think, non-obvious. Where
you have in the code:
    if (flags & vma->vm->bind_alloc)
It is a shorter, hacky way of saying:
    if ((flags & I915_VMA_LOCAL_BIND) &&
        vma->vm->bind_allocates)
Or where you have:
    if (work && (bind_flags & ~vma_flags) & vma->vm->bind_alloc) {
This would be:
    if (work &&
        vma->vm->bind_allocates &&
        (bind_flags & I915_VMA_LOCAL_BIND) &&
        !(vma_flags & I915_VMA_LOCAL_BIND)) {
But I think I now see what your code is actually saying: you are having
vm->bind_alloc mean vm->bind_flags_which_allocate. Did I get your
thinking right this time? If so, how about the compromise of renaming it
to vm->bind_alloc_flags?
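To illustrate what I mean, with that rename the two checks you already have
would read the same, just with the intent spelt out in the name:

    /* vm->bind_alloc_flags: mask of bind flags whose bind must allocate,
     * and therefore has to go through the async worker */
    if (flags & vma->vm->bind_alloc_flags)
        ...

    if (work && (bind_flags & ~vma_flags) & vma->vm->bind_alloc_flags) {
        ...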
Regards,
Tvrtko