Re: [PATCH] drm/i915: use a separate context for gpu relocs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 8/24/2019 1:54 AM, Chris Wilson wrote:
Quoting Daniele Ceraolo Spurio (2019-08-24 01:20:22)
The CS pre-parser can pre-fetch commands across memory sync points and
starting from gen12 it is able to pre-fetch across BB_START and BB_END
boundaries as well, so when we emit gpu relocs the pre-parser might
fetch the target location of the reloc before the memory write lands.

The parser can't pre-fetch across the ctx switch, so we use a separate
context to guarantee that the memory is syncronized before the parser
can get to it.

Suggested-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 27 ++++++++++++++++++-
  1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index b5f6937369ea..d27201c654e9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,6 +252,7 @@ struct i915_execbuffer {
                 bool has_fence : 1;
                 bool needs_unfenced : 1;
+ struct intel_context *ce;
                 struct i915_request *rq;
                 u32 *rq_cmd;
                 unsigned int rq_size;
@@ -903,6 +904,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
         cache->has_fence = cache->gen < 4;
         cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
         cache->node.allocated = false;
+       cache->ce = NULL;
         cache->rq = NULL;
         cache->rq_size = 0;
  }
@@ -943,6 +945,7 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
  static void reloc_cache_reset(struct reloc_cache *cache)
  {
         void *vaddr;
+       struct intel_context *ce;
if (cache->rq)
                 reloc_gpu_flush(cache);
@@ -973,6 +976,10 @@ static void reloc_cache_reset(struct reloc_cache *cache)
                 }
         }
+ ce = fetch_and_zero(&cache->ce);
+       if (ce)
+               intel_context_put(ce);
We don't need to create a new context between every buffer, it can be
released in eb_destroy.

+
         cache->vaddr = 0;
         cache->page = -1;
  }
@@ -1168,7 +1175,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
         if (err)
                 goto err_unmap;
- rq = i915_request_create(eb->context);
+       rq = intel_context_create_request(cache->ce);
         if (IS_ERR(rq)) {
                 err = PTR_ERR(rq);
                 goto err_unpin;
@@ -1239,6 +1246,24 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
                 if (!intel_engine_can_store_dword(eb->engine))
                         return ERR_PTR(-ENODEV);
+ /*
+                * The CS pre-parser can pre-fetch commands across memory sync
+                * points and starting gen12 it is able to pre-fetch across
+                * BB_START and BB_END boundaries (within the same context). We
+                * use a separate context to guarantee that the reloc writes
+                * land before the parser gets to the target memory location.
+                */
+               if (!cache->ce) {
+                       struct intel_context *ce;
+
I was thinking of

	if (cache->gen >= 12)
+                       ce = intel_context_create(eb->context->gem_context,
+                                                 eb->engine);
else
	ce = intel_context_get(eb->context);

Note that this requires us to fix the tagging so that we don't perform a
lite-restore from the reloc instance to the user instance.

What's wrong with lite-restoring in this case? It's not something we stop now AFAICS.

Daniele


But yes, that's more or less the same as the sketch I did :)
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux