On Tue, 2021-06-08 at 11:28 +0200, Thomas Hellström wrote:
> From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>
> If we pipeline the PTE updates and then do the copy of those pages
> within a single unpreemptible command packet, we can submit the copies
> and leave them to be scheduled without having to synchronously wait
> under a global lock. In order to manage migration, we need to
> preallocate the page tables (and keep them pinned and available for use
> at any time), causing a bottleneck for migrations as all clients must
> contend on the limited resources. By inlining the ppGTT updates and
> performing the blit atomically, each client only owns the PTE while in
> use, and so we can reschedule individual operations however we see fit.
> And most importantly, we do not need to take a global lock on the shared
> vm, and wait until the operation is complete before releasing the lock
> for others to claim the PTE for themselves.
>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Co-developed-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  drivers/gpu/drm/i915/gt/intel_engine.h        |   1 +
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   2 +
>  drivers/gpu/drm/i915/gt/intel_migrate.c       | 543 ++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_migrate.h       |  45 ++
>  drivers/gpu/drm/i915/gt/intel_migrate_types.h |  15 +
>  drivers/gpu/drm/i915/gt/intel_ring.h          |   1 +
>  drivers/gpu/drm/i915/gt/selftest_migrate.c    | 291 ++++++++++
>  .../drm/i915/selftests/i915_live_selftests.h  |   1 +
>  9 files changed, 900 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.h
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate_types.h
>  create mode 100644 drivers/gpu/drm/i915/gt/selftest_migrate.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index ea8ee4b3e018..9f18902be626 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -109,6 +109,7 @@ gt-y += \
>  	gt/intel_gtt.o \
>  	gt/intel_llc.o \
>  	gt/intel_lrc.o \
> +	gt/intel_migrate.o \
>  	gt/intel_mocs.o \
>  	gt/intel_ppgtt.o \
>  	gt/intel_rc6.o \
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 0862c42b4cac..949965680c37 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -188,6 +188,7 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
>  #define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT * sizeof(u32))
>  #define I915_GEM_HWS_SEQNO 0x40
>  #define I915_GEM_HWS_SEQNO_ADDR (I915_GEM_HWS_SEQNO * sizeof(u32))
> +#define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32))
>  #define I915_GEM_HWS_SCRATCH 0x80
>
>  #define I915_HWS_CSB_BUF0_INDEX 0x10
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 2694dbb9967e..1c3af0fc0456 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -123,8 +123,10 @@
>  #define MI_SEMAPHORE_SAD_NEQ_SDD (5 << 12)
>  #define MI_SEMAPHORE_TOKEN_MASK REG_GENMASK(9, 5)
>  #define MI_SEMAPHORE_TOKEN_SHIFT 5
> +#define MI_STORE_DATA_IMM MI_INSTR(0x20, 0)
>  #define MI_STORE_DWORD_IMM MI_INSTR(0x20, 1)
>  #define MI_STORE_DWORD_IMM_GEN4 MI_INSTR(0x20, 2)
> +#define MI_STORE_QWORD_IMM_GEN8 (MI_INSTR(0x20, 3) | REG_BIT(21))
>  #define MI_MEM_VIRTUAL (1 << 22) /* 945,g33,965 */
>  #define MI_USE_GGTT (1 << 22) /* g4x+ */
>  #define MI_STORE_DWORD_INDEX MI_INSTR(0x21, 1)
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> new file mode 100644
> index 000000000000..1f60f8ee36f8
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -0,0 +1,543 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2020 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_context.h"
> +#include "intel_gpu_commands.h"
> +#include "intel_gt.h"
> +#include "intel_gtt.h"
> +#include "intel_migrate.h"
> +#include "intel_ring.h"
> +
> ...
> +
> +void intel_migrate_fini(struct intel_migrate *m)
> +{
> +	struct intel_context *ce;
> +
> +	ce = fetch_and_zero(&m->context);
> +	if (!ce)
> +		return;
> +
> +	intel_context_unpin(ce);
> +	intel_context_put(ce);
> +}

Hmm, CI hints that we should be exporting and using
intel_engine_destroy_pinned_context() here...

/Thomas
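
To make the suggestion concrete, here is a rough userspace sketch of the teardown pattern in intel_migrate_fini(). Everything prefixed mock_ is a hypothetical stand-in invented for illustration and is not i915 code. In particular, mock_destroy_pinned_context() only assumes that the helper named above would release the pin and the reference together; that is a guess based on the two calls it would replace, not a description of the real implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_context: only the two counts matter here. */
struct mock_context {
	int pin_count;
	int refcount;
};

/* Hypothetical stand-in for struct intel_migrate. */
struct mock_migrate {
	struct mock_context *context;
};

/*
 * Assumed behaviour of the suggested helper: drop the pin and the
 * reference in one place, so callers cannot forget either half.
 */
void mock_destroy_pinned_context(struct mock_context *ce)
{
	assert(ce->pin_count > 0 && ce->refcount > 0);
	ce->pin_count--;
	ce->refcount--;
}

/*
 * Same shape as intel_migrate_fini() in the quoted patch: take the
 * pointer, clear the field, then tear the context down exactly once.
 */
void mock_migrate_fini(struct mock_migrate *m)
{
	struct mock_context *ce = m->context;

	m->context = NULL;	/* plays the role of fetch_and_zero() */
	if (!ce)
		return;

	mock_destroy_pinned_context(ce);
}
```

Because the field is cleared before the teardown, calling mock_migrate_fini() a second time is a no-op in this sketch, which mirrors the early return in the quoted function.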