Quoting Tvrtko Ursulin (2020-12-01 09:26:05)
> 
> On 30/11/2020 16:44, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-11-30 16:26:44)
> >>
> >> On 30/11/2020 16:21, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2020-11-30 16:07:44)
> >>>>
> >>>> On 30/11/2020 13:39, Chris Wilson wrote:
> >>>>> Quoting Tvrtko Ursulin (2020-11-30 13:12:55)
> >>>>>> On 28/11/2020 18:40, Chris Wilson wrote:
> >>>>>>> If we pipeline the PTE updates and then do the copy of those pages
> >>>>>>> within a single unpreemptible command packet, we can submit the copies
> >>>>>>> and leave them to be scheduled without having to synchronously wait
> >>>>>>> under a global lock.
> >>>>>>>
> >>>>>>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >>>>>>> ---
> >>>>>>>  drivers/gpu/drm/i915/Makefile                 |   1 +
> >>>>>>>  drivers/gpu/drm/i915/gt/intel_engine.h        |   1 +
> >>>>>>>  drivers/gpu/drm/i915/gt/intel_migrate.c       | 370 ++++++++++++++++++
> >>>>>>>  drivers/gpu/drm/i915/gt/intel_migrate.h       |  33 ++
> >>>>>>>  drivers/gpu/drm/i915/gt/selftest_migrate.c    | 105 +++++
> >>>>>>>  .../drm/i915/selftests/i915_live_selftests.h  |   1 +
> >>>>>>>  6 files changed, 511 insertions(+)
> >>>>>>>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.c
> >>>>>>>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.h
> >>>>>>>  create mode 100644 drivers/gpu/drm/i915/gt/selftest_migrate.c
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> >>>>>>> index e5574e506a5c..0b2e12c87f9d 100644
> >>>>>>> --- a/drivers/gpu/drm/i915/Makefile
> >>>>>>> +++ b/drivers/gpu/drm/i915/Makefile
> >>>>>>> @@ -103,6 +103,7 @@ gt-y += \
> >>>>>>>  	gt/intel_gtt.o \
> >>>>>>>  	gt/intel_llc.o \
> >>>>>>>  	gt/intel_lrc.o \
> >>>>>>> +	gt/intel_migrate.o \
> >>>>>>>  	gt/intel_mocs.o \
> >>>>>>>  	gt/intel_ppgtt.o \
> >>>>>>>  	gt/intel_rc6.o \
> >>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> >>>>>>> index ac58fcda4927..079d26b47a97 100644
> >>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> >>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> >>>>>>> @@ -188,6 +188,7 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
> >>>>>>>  #define I915_GEM_HWS_PREEMPT_ADDR	(I915_GEM_HWS_PREEMPT * sizeof(u32))
> >>>>>>>  #define I915_GEM_HWS_SEQNO		0x40
> >>>>>>>  #define I915_GEM_HWS_SEQNO_ADDR	(I915_GEM_HWS_SEQNO * sizeof(u32))
> >>>>>>> +#define I915_GEM_HWS_MIGRATE		(0x42 * sizeof(u32))
> >>>>>>>  #define I915_GEM_HWS_SCRATCH		0x80
> >>>>>>>
> >>>>>>>  #define I915_HWS_CSB_BUF0_INDEX	0x10
> >>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> >>>>>>> new file mode 100644
> >>>>>>> index 000000000000..4d7bd32eb8d4
> >>>>>>> --- /dev/null
> >>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> >>>>>>> @@ -0,0 +1,370 @@
> >>>>>>> +// SPDX-License-Identifier: MIT
> >>>>>>> +/*
> >>>>>>> + * Copyright © 2020 Intel Corporation
> >>>>>>> + */
> >>>>>>> +
> >>>>>>> +#include "i915_drv.h"
> >>>>>>> +#include "intel_context.h"
> >>>>>>> +#include "intel_gt.h"
> >>>>>>> +#include "intel_gtt.h"
> >>>>>>> +#include "intel_lrc.h" /* virtual engine */
> >>>>>>> +#include "intel_migrate.h"
> >>>>>>> +#include "intel_ring.h"
> >>>>>>> +
> >>>>>>> +#define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
> >>>>>>> +
> >>>>>>> +static void insert_pte(struct i915_address_space *vm,
> >>>>>>> +		       struct i915_page_table *pt,
> >>>>>>> +		       void *data)
> >>>>>>> +{
> >>>>>>> +	u64 *offset = data;
> >>>>>>> +
> >>>>>>> +	vm->insert_page(vm, px_dma(pt), *offset, I915_CACHE_NONE, 0);
> >>>>>>> +	*offset += PAGE_SIZE;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +static struct i915_address_space *migrate_vm(struct intel_gt *gt)
> >>>>>>> +{
> >>>>>>> +	struct i915_vm_pt_stash stash = {};
> >>>>>>> +	struct i915_ppgtt *vm;
> >>>>>>> +	u64 offset, sz;
> >>>>>>> +	int err;
> >>>>>>> +
> >>>>>>> +	vm = i915_ppgtt_create(gt);
> >>>>>>> +	if (IS_ERR(vm))
> >>>>>>> +		return ERR_CAST(vm);
> >>>>>>> +
> >>>>>>> +	if (!vm->vm.allocate_va_range || !vm->vm.foreach) {
> >>>>>>> +		err = -ENODEV;
> >>>>>>> +		goto err_vm;
> >>>>>>> +	}
> >>>>>>> +
> >>>>>>> +	/*
> >>>>>>> +	 * We copy in 8MiB chunks. Each PDE covers 2MiB, so we need
> >>>>>>> +	 * 4x2 page directories for source/destination.
> >>>>>>> +	 */
> >>>>>>> +	sz = 2 * CHUNK_SZ;
> >>>>>>> +	offset = sz;
> >>>>>>> +
> >>>>>>> +	/*
> >>>>>>> +	 * We need another page directory setup so that we can write
> >>>>>>> +	 * the 8x512 PTE in each chunk.
> >>>>>>> +	 */
> >>>>>>> +	sz += (sz >> 12) * sizeof(u64);
> >>>>>>> +
> >>>>>>> +	err = i915_vm_alloc_pt_stash(&vm->vm, &stash, sz);
> >>>>>>> +	if (err)
> >>>>>>> +		goto err_vm;
> >>>>>>> +
> >>>>>>> +	err = i915_vm_pin_pt_stash(&vm->vm, &stash);
> >>>>>>> +	if (err) {
> >>>>>>> +		i915_vm_free_pt_stash(&vm->vm, &stash);
> >>>>>>> +		goto err_vm;
> >>>>>>> +	}
> >>>>>>> +
> >>>>>>> +	vm->vm.allocate_va_range(&vm->vm, &stash, 0, sz);
> >>>>>>> +	i915_vm_free_pt_stash(&vm->vm, &stash);
> >>>>>>> +
> >>>>>>> +	/* Now allow the GPU to rewrite the PTE via its own ppGTT */
> >>>>>>> +	vm->vm.foreach(&vm->vm, 0, sz, insert_pte, &offset);
> >>>>>>
> >>>>>> This is just making the [0, sz) gva point to the allocated sz bytes of
> >>>>>> backing store?
> >>>>>
> >>>>> Not quite, we are pointing [offset, sz) to the page directories
> >>>>> themselves. When we write into that range, we are modifying the PTE of
> >>>>> this ppGTT. (Breaking the fourth wall, allowing the non-privileged
> >>>>> context to update its own page tables.)
> >>>>
> >>>> Confusing, so the first step here is making the [0, sz) gva ptes point
> >>>> to pte backing store.
> >>>>
> >>>> Which means emit_pte, when called from intel_context_migrate_pages, is
> >>>> emitting sdw which will write into [0, sz] gva, effectively placing the
> >>>> src and dest object content at those locations.
> >>>>
> >>>> Then blit copies the data.
> >>>>
> >>>> What happens if the context is re-used? Gva's no longer point to PTEs
> >>>> so how? Is a step to re-initialize after use missing? More sdw after
> >>>> the blit copy to make it point back to ptes?
> >>>
> >>> Every chunk starts with writing the PTEs used by the chunk. Within the
> >>> chunk, the GPU commands are not preemptible, so the PTEs cannot be
> >>> replaced before they are consumed.
> >>>
> >>> We do leave the PTE pointing to stale pages after the blit, but since
> >>> the vm is only used by first rewriting the PTE, those dangling PTEs
> >>> should never be chased.
> >>>
> >>> So each chunk does a fixed copy between the same pair of addresses
> >>> within the vm; the magic is in rewriting the PTE to point at the pages
> >>> of interest for the chunk.
> >>
> >> Isn't the vm getting potentially re-used on the fall-back path, if
> >> virtual engine creation fails? In which case dangling PTEs would be
> >> written to. Or am I still missing something?
> >
> > The same vm is used by all contexts (only one is created).
> 
> Right, true. I'll try again because the overall scheme is still
> confusing me..
> 
> Migrate_vm makes [0, 2 * CHUNK_SZ) point to PTE backing store for the
> same range.

[2 * CHUNK_SZ, 2 * CHUNK_SZ + (2 * CHUNK_SZ >> 9))

> Src emit_pte writes up to [0, CHUNK_SZ), effectively re-writing PTEs
> for the same range.

Except that the PD backpointers start at 2 * CHUNK_SZ...

> Dest emit_pte same for the destination range.

That starts at 2 * CHUNK_SZ + (CHUNK_SZ >> 9).

So I passed in the target address range to emit_pte

	len = emit_pte(rq, &it_src, encode, 0, CHUNK_SZ);
	err = emit_pte(rq, &it_dst, encode, CHUNK_SZ, len);

but the first thing emit_pte does is compute the PD offset for that
range:

	offset >>= 12;
	offset *= sizeof(u64);
	offset += 2 * CHUNK_SZ;

> Virtual address [0, 2 * CHUNK_SZ) ptes now point to src and dest
> objects' backing store.

Yes.

> Blit copy emits blitter commands to copy between src and dest.
>
> Now the second copy comes and calls emit_pte, which again writes to the
> [0, CHUNK_SZ) virtual range. How does that end up in the PTE backing
> store and not the previous object backing store?

It all boils down to the PD being offset.
-Chris
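
To make the PD-offset arithmetic above concrete, here is a minimal
stand-alone sketch: plain user-space C with no i915 headers. CHUNK_SZ
and the three shift/multiply/add steps are taken from the quoted
emit_pte() fragment; the helper name pte_slot_address() is invented for
illustration and does not exist in the driver.

/*
 * Model of the migrate-vm address layout discussed in this thread.
 * migrate_vm() maps [0, 2 * CHUNK_SZ) as the copy window (src then dst)
 * and, from 2 * CHUNK_SZ upwards, maps the page-table pages themselves,
 * one u64 PTE slot for every page of that window.
 */
#include <stdint.h>
#include <stdio.h>

#define SZ_8M      (8ull << 20)
#define CHUNK_SZ   SZ_8M   /* as in the patch: 8MiB copy chunks */
#define PAGE_SHIFT 12      /* 4KiB pages */

/*
 * Mimic the offset computation at the top of emit_pte(): given a
 * target address inside the copy window, where does its PTE live?
 */
static uint64_t pte_slot_address(uint64_t target)
{
	uint64_t offset = target;

	offset >>= PAGE_SHIFT;      /* index of the PTE for this page */
	offset *= sizeof(uint64_t); /* byte offset of that PTE slot */
	offset += 2 * CHUNK_SZ;     /* PTE backing store starts here */

	return offset;
}

int main(void)
{
	/* PTEs for the source window [0, CHUNK_SZ) */
	printf("src PTEs written from 0x%llx\n",
	       (unsigned long long)pte_slot_address(0));

	/* PTEs for the destination window [CHUNK_SZ, 2 * CHUNK_SZ) */
	printf("dst PTEs written from 0x%llx\n",
	       (unsigned long long)pte_slot_address(CHUNK_SZ));

	return 0;
}

Compiled and run, this prints 0x1000000 for the source window and
0x1004000 for the destination window, i.e. 2 * CHUNK_SZ and
2 * CHUNK_SZ + (CHUNK_SZ >> 9), matching the offsets quoted above. The
stores emitted by emit_pte() always land in the page-table region at or
above 2 * CHUNK_SZ, never inside the copy window [0, 2 * CHUNK_SZ), so
reusing the window for the next chunk rewrites PTEs rather than the
previously mapped object pages.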