Tim Gore
Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ

> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf
> Of akash.goel@xxxxxxxxx
> Sent: Tuesday, March 22, 2016 8:43 AM
> To: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Goel, Akash
> Subject: [PATCH v9] drm/i915: Support to enable TRTT on GEN9
>
> From: Akash Goel <akash.goel@xxxxxxxxx>
>
> Gen9 has additional address translation hardware support in the form of a
> Tiled Resource Translation Table (TR-TT), which provides an extra level of
> abstraction over PPGTT.
> This is useful for mapping Sparse/Tiled texture resources.
> Sparse resources are created as virtual-only allocations. Regions of the
> resource that the application intends to use are bound to physical memory
> on the fly and can be re-bound to different memory allocations over the
> lifetime of the resource.
>
> TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
> for each new PPGTT instance, but TR-TT need not be enabled for every
> context.
> 1/16th of the 48-bit PPGTT space is earmarked for translation by TR-TT;
> which chunk to use is conveyed to HW through a register.
> Any GFX address that lies in that reserved 44-bit range is translated
> through TR-TT first and then through PPGTT to get the actual physical
> address, so the output of the TR-TT translation is a PPGTT offset.
>
> TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size,
> which leaves behind 44-16=28 address bits. The 28 bits are partitioned
> as 9+9+10, and each level is contained within a 4KB page, hence L3 and
> L2 are each composed of 512 64b entries and L1 is composed of 1024 32b
> entries.
>
> There is a provision to keep the TR-TT tables in virtual space, where the
> pages of the TR-TT tables are mapped to PPGTT.
> Currently this is the supported mode; in this mode the UMD has full
> control of TR-TT management, with bare minimum support from the KMD.
> So the entries of the L3 table will contain the PPGTT offset of L2 table
> pages, and similarly the entries of the L2 table will contain the PPGTT
> offset of L1 table pages.
> The entries of the L1 table will contain the PPGTT offset of the BOs
> actually backing the Sparse resources.
> UMD will have to allocate the L3/L2/L1 table pages as regular BOs only &
> assign them a PPGTT address through the Soft Pin API (for example, use
> soft pin to assign l3_table_address to the L3 table BO, when used).
> UMD will also program the entries in the TR-TT page tables using regular
> batch commands (MI_STORE_DATA_IMM), or via mmapping of the page
> table BOs.
> UMD may do the complete PPGTT address space management, which could
> help minimize conflicts.
>
> Any space in the TR-TT segment not bound to any Sparse texture will be
> handled through the Invalid tile; the user is expected to initialize the
> entries of a new L3/L2/L1 table page with the Invalid tile pattern. The
> entries corresponding to the holes in a Sparse texture resource will be
> set with the Null tile pattern.
> Improper programming of TR-TT should only lead to a recoverable GPU
> hang, eventually leading to banning of the culprit context without
> victimizing others.
>
> The association of any Sparse resource with the BOs will be known only to
> the UMD, and only Sparse resources shall be assigned an offset from the
> TR-TT segment by the UMD. The use of the TR-TT segment and the mapping
> of Sparse resources will be transparent to the KMD; the UMD will do the
> address assignment from the TR-TT segment autonomously and the KMD will
> be oblivious of it.
> Non-sparse objects must not be assigned an address from the TR-TT
> segment; they will be mapped to PPGTT in the regular way by the KMD.
>
> This patch provides an interface through which the UMD can request the
> KMD to enable TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT
> param has been added to the I915_GEM_CONTEXT_SETPARAM ioctl for that
> purpose.
> UMD will have to pass the GFX address of the L3 table page and the start
> location of the TR-TT segment, along with the pattern values for the
> Null & Invalid tile registers.
>
> v2:
>  - Support context_getparam for TRTT also and dispense with a separate
>    GETPARAM case for TRTT (Chris).
>  - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
>    from user space (Chris).
>  - Move all the argument checking for TRTT in context_setparam to the
>    set_trtt function (Chris).
>  - Change the type of the 'flags' field inside 'intel_context' to
>    unsigned (Chris)
>  - Rename certain functions to rightly reflect their purpose, rename
>    the new param for TRTT in gem_context_param to
>    I915_CONTEXT_PARAM_TRTT, rephrase a few lines in the commit message
>    body, add more comments (Chris).
>  - Extend the ABI to allow the User to specify the TRTT segment location
>    also.
>  - Fix for selective enabling of TRTT on a per context basis; explicitly
>    disable TR-TT at the start of a new context.
>
> v3:
>  - Check the return value of gen9_emit_trtt_regs (Chris)
>  - Update the kernel doc for the intel_context structure.
>  - Rebased.
>
> v4:
>  - Fix the warnings reported by 'checkpatch.pl --strict' (Michel)
>  - Fix the context_getparam implementation avoiding the reset of the size
>    field, affecting the TRTT case.
>
> v5:
>  - Update the TR-TT params right away in context_setparam, by constructing
>    & submitting a request emitting LRIs, instead of deferring it and
>    conflating with the next batch submission (Chris)
>  - Follow the prescribed struct_mutex handling rules while accessing the
>    User space buffer, both in the context_setparam & getparam functions
>    (Chris).
>
> v6:
>  - Fix the warning caused by removal of an un-allocated trtt vma node.
>
> v7:
>  - Move context ref/unref to context_setparam_ioctl from set_trtt() &
>    remove that from get_trtt() as not really needed there (Chris).
>  - Add a check for improper values for the Null & Invalid tiles.
>  - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
> - Rebased. > > v8: > - Add context ref/unref to context_getparam_ioctl also so as to be > consistent > and ease the extension of ioctl in future (Chris) > > v9: > - Fix the handling of return value from trtt_context_allocate_vma() function, > causing kernel panic at the time of destroying context, in case of > unsuccessful allocation of trtt vma. > - Rebased. > > Testcase: igt/gem_trtt > > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Michel Thierry <michel.thierry@xxxxxxxxx> > Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx> > Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > --- > drivers/gpu/drm/i915/i915_drv.h | 16 +++- > drivers/gpu/drm/i915/i915_gem_context.c | 157 > +++++++++++++++++++++++++++++++- > drivers/gpu/drm/i915/i915_gem_gtt.c | 65 +++++++++++++ > drivers/gpu/drm/i915/i915_gem_gtt.h | 8 ++ > drivers/gpu/drm/i915/i915_reg.h | 19 ++++ > drivers/gpu/drm/i915/intel_lrc.c | 124 ++++++++++++++++++++++++- > drivers/gpu/drm/i915/intel_lrc.h | 1 + > include/uapi/drm/i915_drm.h | 8 ++ > 8 files changed, 393 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h > b/drivers/gpu/drm/i915/i915_drv.h index ecbd418..272d1f8 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -804,6 +804,7 @@ struct i915_ctx_hang_stats { #define > DEFAULT_CONTEXT_HANDLE 0 > > #define CONTEXT_NO_ZEROMAP (1<<0) > +#define CONTEXT_USE_TRTT (1 << 1) > /** > * struct intel_context - as the name implies, represents a context. > * @ref: reference count. > @@ -818,6 +819,8 @@ struct i915_ctx_hang_stats { > * @ppgtt: virtual memory space used by this context. > * @legacy_hw_ctx: render context backing object and whether it is > correctly > * initialized (legacy ring submission mechanism only). > + * @trtt_info: Programming parameters for tr-tt (redirection tables for > + * userspace, for sparse resource management) > * @link: link in the global list of contexts. 
> *
> * Contexts are memory images used by the hardware to store copies of their
> @@ -828,7 +831,7 @@ struct intel_context {
> 	int user_handle;
> 	uint8_t remap_slice;
> 	struct drm_i915_private *i915;
> -	int flags;
> +	unsigned int flags;
> 	struct drm_i915_file_private *file_priv;
> 	struct i915_ctx_hang_stats hang_stats;
> 	struct i915_hw_ppgtt *ppgtt;
> @@ -849,6 +852,15 @@ struct intel_context {
> 		uint32_t *lrc_reg_state;
> 	} engine[I915_NUM_ENGINES];
>
> +	/* TRTT info */
> +	struct intel_context_trtt {
> +		u32 invd_tile_val;
> +		u32 null_tile_val;
> +		u64 l3_table_address;
> +		u64 segment_base_addr;
> +		struct i915_vma *vma;
> +	} trtt_info;
> +
> 	struct list_head link;
> };
>
> @@ -2657,6 +2669,8 @@ struct drm_i915_cmd_table {
> 				!IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \
> 				!IS_BROXTON(dev))
>
> +#define HAS_TRTT(dev)	(IS_GEN9(dev))
> +

A very minor point, but there is a w/a to disable TRTT for BXT_REVID_A0/1. I realise this is basically obsolete now, but I'm still using one!
> #define INTEL_PCH_DEVICE_ID_MASK 0xff00 > #define INTEL_PCH_IBX_DEVICE_ID_TYPE 0x3b00 > #define INTEL_PCH_CPT_DEVICE_ID_TYPE 0x1c00 > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > b/drivers/gpu/drm/i915/i915_gem_context.c > index 394e525..5f28c23 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -133,6 +133,14 @@ static int get_context_size(struct drm_device *dev) > return ret; > } > > +static void intel_context_free_trtt(struct intel_context *ctx) { > + if (!ctx->trtt_info.vma) > + return; > + > + intel_trtt_context_destroy_vma(ctx->trtt_info.vma); > +} > + > static void i915_gem_context_clean(struct intel_context *ctx) { > struct i915_hw_ppgtt *ppgtt = ctx->ppgtt; @@ -164,6 +172,8 @@ > void i915_gem_context_free(struct kref *ctx_ref) > */ > i915_gem_context_clean(ctx); > > + intel_context_free_trtt(ctx); > + > i915_ppgtt_put(ctx->ppgtt); > > if (ctx->legacy_hw_ctx.rcs_state) > @@ -507,6 +517,129 @@ i915_gem_context_get(struct > drm_i915_file_private *file_priv, u32 id) > return ctx; > } > > +static int > +intel_context_get_trtt(struct intel_context *ctx, > + struct drm_i915_gem_context_param *args) { > + struct drm_i915_gem_context_trtt_param trtt_params; > + struct drm_device *dev = ctx->i915->dev; > + > + if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) { > + return -ENODEV; > + } else if (args->size < sizeof(trtt_params)) { > + args->size = sizeof(trtt_params); > + } else { > + trtt_params.segment_base_addr = > + ctx->trtt_info.segment_base_addr; > + trtt_params.l3_table_address = > + ctx->trtt_info.l3_table_address; > + trtt_params.null_tile_val = > + ctx->trtt_info.null_tile_val; > + trtt_params.invd_tile_val = > + ctx->trtt_info.invd_tile_val; > + > + mutex_unlock(&dev->struct_mutex); > + > + if (__copy_to_user(to_user_ptr(args->value), > + &trtt_params, > + sizeof(trtt_params))) { > + mutex_lock(&dev->struct_mutex); > + return -EFAULT; > + } > + > + args->size = sizeof(trtt_params); > + 
mutex_lock(&dev->struct_mutex); > + } > + > + return 0; > +} > + > +static int > +intel_context_set_trtt(struct intel_context *ctx, > + struct drm_i915_gem_context_param *args) { > + struct drm_i915_gem_context_trtt_param trtt_params; > + struct i915_vma *vma; > + struct drm_device *dev = ctx->i915->dev; > + int ret; > + > + if (!HAS_TRTT(dev) || !USES_FULL_48BIT_PPGTT(dev)) > + return -ENODEV; > + else if (ctx->flags & CONTEXT_USE_TRTT) > + return -EEXIST; > + else if (args->size < sizeof(trtt_params)) > + return -EINVAL; > + > + mutex_unlock(&dev->struct_mutex); > + > + if (copy_from_user(&trtt_params, > + to_user_ptr(args->value), > + sizeof(trtt_params))) { > + mutex_lock(&dev->struct_mutex); > + ret = -EFAULT; > + goto exit; > + } > + > + mutex_lock(&dev->struct_mutex); > + > + /* Check if the setup happened from another path */ > + if (ctx->flags & CONTEXT_USE_TRTT) { > + ret = -EEXIST; > + goto exit; > + } > + > + /* basic sanity checks for the segment location & l3 table pointer */ > + if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - > 1)) { > + i915_dbg(dev, "segment base address not correctly > aligned\n"); > + ret = -EINVAL; > + goto exit; > + } > + > + if (((trtt_params.l3_table_address + PAGE_SIZE) >= > + trtt_params.segment_base_addr) && > + (trtt_params.l3_table_address < > + (trtt_params.segment_base_addr + > GEN9_TRTT_SEGMENT_SIZE))) { > + i915_dbg(dev, "l3 table address conflicts with trtt > segment\n"); > + ret = -EINVAL; > + goto exit; > + } > + > + if (trtt_params.l3_table_address & > ~GEN9_TRTT_L3_GFXADDR_MASK) { > + i915_dbg(dev, "invalid l3 table address\n"); > + ret = -EINVAL; > + goto exit; > + } > + > + if (trtt_params.null_tile_val == trtt_params.invd_tile_val) { > + i915_dbg(dev, "incorrect values for null & invalid tiles\n"); > + return -EINVAL; > + } > + > + vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base, > + trtt_params.segment_base_addr); > + if (IS_ERR(vma)) { > + ret = PTR_ERR(vma); > + goto exit; > + } > + > 
+ ctx->trtt_info.vma = vma; > + ctx->trtt_info.null_tile_val = trtt_params.null_tile_val; > + ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val; > + ctx->trtt_info.l3_table_address = trtt_params.l3_table_address; > + ctx->trtt_info.segment_base_addr = > trtt_params.segment_base_addr; > + > + ret = intel_lr_rcs_context_setup_trtt(ctx); > + if (ret) { > + intel_trtt_context_destroy_vma(ctx->trtt_info.vma); > + goto exit; > + } > + > + ctx->flags |= CONTEXT_USE_TRTT; > + > +exit: > + return ret; > +} > + > static inline int > mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) { @@ - > 931,7 +1064,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device > *dev, void *data, > return PTR_ERR(ctx); > } > > - args->size = 0; > + /* > + * Take a reference also, as in certain cases we have to release & > + * reacquire the struct_mutex and we don't want the context to > + * go away. > + */ > + i915_gem_context_reference(ctx); > + > + args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : > +args->size; > switch (args->param) { > case I915_CONTEXT_PARAM_BAN_PERIOD: > args->value = ctx->hang_stats.ban_period_seconds; > @@ -947,10 +1087,14 @@ int i915_gem_context_getparam_ioctl(struct > drm_device *dev, void *data, > else > args->value = to_i915(dev)->ggtt.base.total; > break; > + case I915_CONTEXT_PARAM_TRTT: > + ret = intel_context_get_trtt(ctx, args); > + break; > default: > ret = -EINVAL; > break; > } > + i915_gem_context_unreference(ctx); > mutex_unlock(&dev->struct_mutex); > > return ret; > @@ -974,6 +1118,13 @@ int i915_gem_context_setparam_ioctl(struct > drm_device *dev, void *data, > return PTR_ERR(ctx); > } > > + /* > + * Take a reference also, as in certain cases we have to release & > + * reacquire the struct_mutex and we don't want the context to > + * go away. 
> + */ > + i915_gem_context_reference(ctx); > + > switch (args->param) { > case I915_CONTEXT_PARAM_BAN_PERIOD: > if (args->size) > @@ -992,10 +1143,14 @@ int i915_gem_context_setparam_ioctl(struct > drm_device *dev, void *data, > ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : > 0; > } > break; > + case I915_CONTEXT_PARAM_TRTT: > + ret = intel_context_set_trtt(ctx, args); > + break; > default: > ret = -EINVAL; > break; > } > + i915_gem_context_unreference(ctx); > mutex_unlock(&dev->struct_mutex); > > return ret; > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 0715bb7..cbf8a03 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -2169,6 +2169,17 @@ int i915_ppgtt_init_hw(struct drm_device *dev) { > gtt_write_workarounds(dev); > > + if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) { > + struct drm_i915_private *dev_priv = dev->dev_private; > + /* > + * Globally enable TR-TT support in Hw. > + * Still TR-TT enabling on per context basis is required. > + * Non-trtt contexts are not affected by this setting. > + */ > + I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR, > + GEN9_TRTT_BYPASS_DISABLE); > + } > + > /* In the case of execlists, PPGTT is enabled by the context > descriptor > * and the PDPs are contained within the context itself. We don't > * need to do anything here. 
*/ > @@ -3362,6 +3373,60 @@ > i915_gem_obj_lookup_or_create_ggtt_vma(struct drm_i915_gem_object > *obj, > > } > > +void intel_trtt_context_destroy_vma(struct i915_vma *vma) { > + struct i915_address_space *vm = vma->vm; > + > + WARN_ON(!list_empty(&vma->obj_link)); > + WARN_ON(!list_empty(&vma->vm_link)); > + WARN_ON(!list_empty(&vma->exec_list)); > + > + WARN_ON(!vma->pin_count); > + > + if (drm_mm_node_allocated(&vma->node)) > + drm_mm_remove_node(&vma->node); > + > + i915_ppgtt_put(i915_vm_to_ppgtt(vm)); > + kmem_cache_free(to_i915(vm->dev)->vmas, vma); } > + > +struct i915_vma * > +intel_trtt_context_allocate_vma(struct i915_address_space *vm, > + uint64_t segment_base_addr) > +{ > + struct i915_vma *vma; > + int ret; > + > + vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL); > + if (!vma) > + return ERR_PTR(-ENOMEM); > + > + INIT_LIST_HEAD(&vma->obj_link); > + INIT_LIST_HEAD(&vma->vm_link); > + INIT_LIST_HEAD(&vma->exec_list); > + vma->vm = vm; > + i915_ppgtt_get(i915_vm_to_ppgtt(vm)); > + > + /* Mark the vma as permanently pinned */ > + vma->pin_count = 1; > + > + /* Reserve from the 48 bit PPGTT space */ > + vma->node.start = segment_base_addr; > + vma->node.size = GEN9_TRTT_SEGMENT_SIZE; > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > + if (ret) { > + ret = i915_gem_evict_for_vma(vma); > + if (ret == 0) > + ret = drm_mm_reserve_node(&vm->mm, &vma- > >node); > + } > + if (ret) { > + intel_trtt_context_destroy_vma(vma); > + return ERR_PTR(ret); > + } > + > + return vma; > +} > + > static struct scatterlist * > rotate_pages(const dma_addr_t *in, unsigned int offset, > unsigned int width, unsigned int height, diff --git > a/drivers/gpu/drm/i915/i915_gem_gtt.h > b/drivers/gpu/drm/i915/i915_gem_gtt.h > index d804be0..8cbaca2 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.h > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > @@ -128,6 +128,10 @@ typedef uint64_t gen8_ppgtt_pml4e_t; > #define GEN8_PPAT_ELLC_OVERRIDE (0<<2) > #define GEN8_PPAT(i, x) 
((uint64_t) (x) << ((i) * 8)) > > +/* Fixed size segment */ > +#define GEN9_TRTT_SEG_SIZE_SHIFT 44 > +#define GEN9_TRTT_SEGMENT_SIZE (1ULL << > GEN9_TRTT_SEG_SIZE_SHIFT) > + > enum i915_ggtt_view_type { > I915_GGTT_VIEW_NORMAL = 0, > I915_GGTT_VIEW_ROTATED, > @@ -560,4 +564,8 @@ size_t > i915_ggtt_view_size(struct drm_i915_gem_object *obj, > const struct i915_ggtt_view *view); > > +struct i915_vma * > +intel_trtt_context_allocate_vma(struct i915_address_space *vm, > + uint64_t segment_base_addr); > +void intel_trtt_context_destroy_vma(struct i915_vma *vma); > #endif > diff --git a/drivers/gpu/drm/i915/i915_reg.h > b/drivers/gpu/drm/i915/i915_reg.h index 264885f..07936b6 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -188,6 +188,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t > reg) > #define GEN8_RPCS_EU_MIN_SHIFT 0 > #define GEN8_RPCS_EU_MIN_MASK (0xf << > GEN8_RPCS_EU_MIN_SHIFT) > > +#define GEN9_TR_CHICKEN_BIT_VECTOR _MMIO(0x4DFC) > +#define GEN9_TRTT_BYPASS_DISABLE (1 << 0) > + > +/* TRTT registers in the H/W Context */ > +#define GEN9_TRTT_L3_POINTER_DW0 _MMIO(0x4DE0) > +#define GEN9_TRTT_L3_POINTER_DW1 _MMIO(0x4DE4) > +#define GEN9_TRTT_L3_GFXADDR_MASK 0xFFFFFFFF0000 > + > +#define GEN9_TRTT_NULL_TILE_REG _MMIO(0x4DE8) > +#define GEN9_TRTT_INVD_TILE_REG _MMIO(0x4DEC) > + > +#define GEN9_TRTT_VA_MASKDATA _MMIO(0x4DF0) > +#define GEN9_TRVA_MASK_VALUE 0xF0 > +#define GEN9_TRVA_DATA_MASK 0xF > + > +#define GEN9_TRTT_TABLE_CONTROL _MMIO(0x4DF4) > +#define GEN9_TRTT_IN_GFX_VA_SPACE (1 << 1) > +#define GEN9_TRTT_ENABLE (1 << 0) > + > #define GAM_ECOCHK _MMIO(0x4090) > #define BDW_DISABLE_HDC_INVALIDATION (1<<25) > #define ECOCHK_SNB_BIT (1<<10) > diff --git a/drivers/gpu/drm/i915/intel_lrc.c > b/drivers/gpu/drm/i915/intel_lrc.c > index 3a23b95..8af480b 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1645,6 +1645,76 @@ static int gen9_init_render_ring(struct > 
intel_engine_cs *engine) > return init_workarounds_ring(engine); > } > > +static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req) > +{ > + struct intel_ringbuffer *ringbuf = req->ringbuf; > + int ret; > + > + ret = intel_logical_ring_begin(req, 2 + 2); > + if (ret) > + return ret; > + > + intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1)); > + > + intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL); > + intel_logical_ring_emit(ringbuf, 0); > + > + intel_logical_ring_emit(ringbuf, MI_NOOP); > + intel_logical_ring_advance(ringbuf); > + > + return 0; > +} > + > +static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req) { > + struct intel_context *ctx = req->ctx; > + struct intel_ringbuffer *ringbuf = req->ringbuf; > + u64 masked_l3_gfx_address = > + ctx->trtt_info.l3_table_address & > GEN9_TRTT_L3_GFXADDR_MASK; > + u32 trva_data_value = > + (ctx->trtt_info.segment_base_addr >> > GEN9_TRTT_SEG_SIZE_SHIFT) & > + GEN9_TRVA_DATA_MASK; > + const int num_lri_cmds = 6; > + int ret; > + > + /* > + * Emitting LRIs to update the TRTT registers is most reliable, instead > + * of directly updating the context image, as this will ensure that > + * update happens in a serialized manner for the context and also > + * lite-restore scenario will get handled. 
> + */
> +	ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2);
> +	if (ret)
> +		return ret;
> +
> +	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds));
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW0);
> +	intel_logical_ring_emit(ringbuf, lower_32_bits(masked_l3_gfx_address));
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_L3_POINTER_DW1);
> +	intel_logical_ring_emit(ringbuf, upper_32_bits(masked_l3_gfx_address));
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_NULL_TILE_REG);
> +	intel_logical_ring_emit(ringbuf, ctx->trtt_info.null_tile_val);
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_INVD_TILE_REG);
> +	intel_logical_ring_emit(ringbuf, ctx->trtt_info.invd_tile_val);
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_VA_MASKDATA);
> +	intel_logical_ring_emit(ringbuf,
> +				GEN9_TRVA_MASK_VALUE | trva_data_value);
> +
> +	intel_logical_ring_emit_reg(ringbuf, GEN9_TRTT_TABLE_CONTROL);
> +	intel_logical_ring_emit(ringbuf,
> +				GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE);
> +
> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	intel_logical_ring_advance(ringbuf);
> +
> +	return 0;
> +}
> +
> static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
> {
> 	struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
> @@ -2003,6 +2073,25 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
> 	return intel_lr_context_render_state_init(req);
> }
>
> +static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
> +{
> +	int ret;
> +
> +	/*
> +	 * Explicitly disable TR-TT at the start of a new context.
> +	 * Otherwise on switching from a TR-TT context to a new Non TR-TT
> +	 * context the TR-TT settings of the outgoing context could get
> +	 * spilled on to the new incoming context as only the Ring Context
> +	 * part is loaded on the first submission of a new context, due to
> +	 * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
> + */ > + ret = gen9_init_rcs_context_trtt(req); > + if (ret) > + return ret; > + > + return gen8_init_rcs_context(req); > +} > + > /** > * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer > * > @@ -2134,11 +2223,14 @@ static int logical_render_ring_init(struct > drm_device *dev) > logical_ring_default_vfuncs(dev, engine); > > /* Override some for render ring. */ > - if (INTEL_INFO(dev)->gen >= 9) > + if (INTEL_INFO(dev)->gen >= 9) { > engine->init_hw = gen9_init_render_ring; > - else > + engine->init_context = gen9_init_rcs_context; > + } else { > engine->init_hw = gen8_init_render_ring; > - engine->init_context = gen8_init_rcs_context; > + engine->init_context = gen8_init_rcs_context; > + } > + > engine->cleanup = intel_fini_pipe_control; > engine->emit_flush = gen8_emit_flush_render; > engine->emit_request = gen8_emit_request_render; @@ -2702,3 > +2794,29 @@ void intel_lr_context_reset(struct drm_device *dev, > ringbuf->tail = 0; > } > } > + > +int intel_lr_rcs_context_setup_trtt(struct intel_context *ctx) { > + struct intel_engine_cs *engine = &(ctx->i915->engine[RCS]); > + struct drm_i915_gem_request *req; > + int ret; > + > + if (!ctx->engine[RCS].state) { > + ret = intel_lr_context_deferred_alloc(ctx, engine); > + if (ret) > + return ret; > + } > + > + req = i915_gem_request_alloc(engine, ctx); > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > + ret = gen9_emit_trtt_regs(req); > + if (ret) { > + i915_gem_request_cancel(req); > + return ret; > + } > + > + i915_add_request(req); > + return 0; > +} > diff --git a/drivers/gpu/drm/i915/intel_lrc.h > b/drivers/gpu/drm/i915/intel_lrc.h > index a17cb12..f3600b2 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.h > +++ b/drivers/gpu/drm/i915/intel_lrc.h > @@ -107,6 +107,7 @@ void intel_lr_context_reset(struct drm_device *dev, > struct intel_context *ctx); > uint64_t intel_lr_context_descriptor(struct intel_context *ctx, > struct intel_engine_cs *engine); > +int 
intel_lr_rcs_context_setup_trtt(struct intel_context *ctx);
>
> u32 intel_execlists_ctx_id(struct intel_context *ctx,
> 			   struct intel_engine_cs *engine);
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a5524cc..604da23 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1167,7 +1167,15 @@ struct drm_i915_gem_context_param {
> #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
> #define I915_CONTEXT_PARAM_NO_ZEROMAP	0x2
> #define I915_CONTEXT_PARAM_GTT_SIZE	0x3
> +#define I915_CONTEXT_PARAM_TRTT		0x4
> 	__u64 value;
> };
>
> +struct drm_i915_gem_context_trtt_param {
> +	__u64 segment_base_addr;
> +	__u64 l3_table_address;
> +	__u32 invd_tile_val;
> +	__u32 null_tile_val;
> +};
> +
> #endif /* _UAPI_I915_DRM_H_ */
> --
> 1.9.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx