On Sat, Jan 09, 2016 at 05:00:21PM +0530, akash.goel@xxxxxxxxx wrote: > From: Akash Goel <akash.goel@xxxxxxxxx> > > Gen9 has an additional address translation hardware support in form of > Tiled Resource Translation Table (TR-TT) which provides an extra level > of abstraction over PPGTT. > This is useful for mapping Sparse/Tiled texture resources. > Sparse resources are created as virtual-only allocations. Regions of the > resource that the application intends to use is bound to the physical memory > on the fly and can be re-bound to different memory allocations over the > lifetime of the resource. > > TR-TT is tightly coupled with PPGTT, a new instance of TR-TT will be required > for a new PPGTT instance, but TR-TT may not enabled for every context. > 1/16th of the 48bit PPGTT space is earmarked for the translation by TR-TT, > which such chunk to use is conveyed to HW through a register. > Any GFX address, which lies in that reserved 44 bit range will be translated > through TR-TT first and then through PPGTT to get the actual physical address, > so the output of translation from TR-TT will be a PPGTT offset. > > TRTT is constructed as a 3 level tile Table. Each tile is 64KB is size which > leaves behind 44-16=28 address bits. 28bits are partitioned as 9+9+10, and > each level is contained within a 4KB page hence L3 and L2 is composed of > 512 64b entries and L1 is composed of 1024 32b entries. > > There is a provision to keep TR-TT Tables in virtual space, where the pages of > TRTT tables will be mapped to PPGTT. > Currently this is the supported mode, in this mode UMD will have a full control > on TR-TT management, with bare minimum support from KMD. > So the entries of L3 table will contain the PPGTT offset of L2 Table pages, > similarly entries of L2 table will contain the PPGTT offset of L1 Table pages. > The entries of L1 table will contain the PPGTT offset of BOs actually backing > the Sparse resources. > The assumption here is that UMD only will do the complete PPGTT address space > management and use the Soft Pin API for all the buffer objects associated with > a given Context. That is a poor assumption, and not one required for this to work. > So UMD will also have to allocate the L3/L2/L1 table pages > as a regular GEM BO only & assign them a PPGTT address through the Soft Pin API. > UMD would have to emit the MI_STORE_DATA_IMM commands in the batch buffer to > program the relevant entries of L3/L2/L1 tables. This only applies to te TR-TT L1-L3 cache, right? > Any space in TR-TT segment not bound to any Sparse texture, will be handled > through Invalid tile, User is expected to initialize the entries of a new > L3/L2/L1 table page with the Invalid tile pattern. The entries corresponding to > the holes in the Sparse texture resource will be set with the Null tile pattern > The improper programming of TRTT should only lead to a recoverable GPU hang, > eventually leading to banning of the culprit context without victimizing others. > > The association of any Sparse resource with the BOs will be known only to UMD, > and only the Sparse resources shall be assigned an offset from the TR-TT segment > by UMD. The use of TR-TT segment or mapping of Sparse resources will be > abstracted from the KMD, s/abstracted from/transparent to/ s/,/;/ > UMD can do the address assignment from TR-TT segment s/can/will/ > autonomously and KMD will be oblivious of it. > The BOs must not be assigned an address from TR-TT segment, they will be mapped s/The BOs/Any object/ > to PPGTT in a regular way by KMD s/using the Soft Pin offset provided by UMD// as this is irrelevant. > This patch provides an interface through which UMD can convey KMD to enable > TR-TT for a given context. A new I915_CONTEXT_PARAM_ENABLE_TRTT param has been > added to I915_GEM_CONTEXT_SETPARAM ioctl for that purpose. > UMD will have to pass the GFX address of L3 table page, +along with the > pattern value for the > Null & invalid Tile registers. > > Testcase: igt/gem_trtt > > Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx> > --- > drivers/gpu/drm/i915/i915_dma.c | 3 ++ > drivers/gpu/drm/i915/i915_drv.h | 12 +++++++ > drivers/gpu/drm/i915/i915_gem_context.c | 45 ++++++++++++++++++++++++++ > drivers/gpu/drm/i915/i915_gem_gtt.c | 57 +++++++++++++++++++++++++++++++++ > drivers/gpu/drm/i915/i915_gem_gtt.h | 6 ++++ > drivers/gpu/drm/i915/i915_reg.h | 19 +++++++++++ > drivers/gpu/drm/i915/intel_lrc.c | 41 ++++++++++++++++++++++++ > include/uapi/drm/i915_drm.h | 8 +++++ > 8 files changed, 191 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 988a380..c247c25 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data, > case I915_PARAM_HAS_EXEC_SOFTPIN: > value = 1; > break; > + case I915_PARAM_HAS_TRTT: > + value = HAS_TRTT(dev); > + break; Should we do this here, or just query the context? In fact you are missing the context getparam path any way. > default: > DRM_DEBUG("Unknown parameter %d\n", param->param); > return -EINVAL; > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index c6dd4db..12c612e 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -839,6 +839,7 @@ struct i915_ctx_hang_stats { > #define DEFAULT_CONTEXT_HANDLE 0 > > #define CONTEXT_NO_ZEROMAP (1<<0) > +#define CONTEXT_USE_TRTT (1<<1) Make flags unsigned whilst you are here, and fix the holes! > /** > * struct intel_context - as the name implies, represents a context. > * @ref: reference count. > @@ -881,6 +882,15 @@ struct intel_context { > int pin_count; > } engine[I915_NUM_RINGS]; > > + /* TRTT info */ > + struct { Give this a name now, we will be thankful in the future. > + uint32_t invd_tile_val; > + uint32_t null_tile_val; > + uint64_t l3_table_address; > + struct i915_vma *vma; > + bool update_trtt_params; > + } trtt_info; > + > struct list_head link; > }; > > @@ -2626,6 +2636,8 @@ struct drm_i915_cmd_table { > !IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) && \ > !IS_BROXTON(dev)) > > +#define HAS_TRTT(dev) (IS_GEN9(dev)) > + > #define INTEL_PCH_DEVICE_ID_MASK 0xff00 > #define INTEL_PCH_IBX_DEVICE_ID_TYPE 0x3b00 > #define INTEL_PCH_CPT_DEVICE_ID_TYPE 0x1c00 > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index 900ffd0..ae9fc34 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -146,6 +146,9 @@ static void i915_gem_context_clean(struct intel_context *ctx) > if (WARN_ON(__i915_vma_unbind_no_wait(vma))) > break; > } > + > + if (ctx->flags & CONTEXT_USE_TRTT) > + i915_gem_destroy_trtt_vma(ctx->trtt_info.vma); Sould be in context free. > } > > void i915_gem_context_free(struct kref *ctx_ref) > @@ -512,6 +515,35 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) > return ctx; > } > > +static int > +i915_setup_trtt_ctx(struct intel_context *ctx, > + struct drm_i915_gem_context_trtt_param *trtt_params) > +{ > + if (ctx->flags & CONTEXT_USE_TRTT) > + return -EEXIST; > + > + /* basic sanity checks for the l3 table pointer */ > + if ((ctx->trtt_info.l3_table_address >= GEN9_TRTT_SEGMENT_START) && > + (ctx->trtt_info.l3_table_address < > + (GEN9_TRTT_SEGMENT_START + GEN9_TRTT_SEGMENT_SIZE))) Presumably l3_table has an actual size and you want to do a range overlap test, not just the start address. > + return -EINVAL; > + > + if (ctx->trtt_info.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) > + return -EINVAL; These are worth adding DRM_DEBUG() or even better start using dev_debug() so that we can debug userspace startup issues. > @@ -952,6 +984,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, > { > struct drm_i915_file_private *file_priv = file->driver_priv; > struct drm_i915_gem_context_param *args = data; > + struct drm_i915_gem_context_trtt_param trtt_params; > struct intel_context *ctx; > int ret; > > @@ -983,6 +1016,18 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, > ctx->flags |= args->value ? CONTEXT_NO_ZEROMAP : 0; > } > break; > + case I915_CONTEXT_PARAM_ENABLE_TRTT: Bump this case to i915_setup_trtt_ctx. i.e. just have ret = i915_setup_trtt_ctx(ctx, args); break; Otherwise this function will become very unwieldly very quickly. > int i915_ppgtt_init_hw(struct drm_device *dev) > { > + if (HAS_TRTT(dev) && USES_FULL_48BIT_PPGTT(dev)) { > + struct drm_i915_private *dev_priv = dev->dev_private; > + > + I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR, > + GEN9_TRTT_BYPASS_DISABLE); Shouldn't this be a context specific register? In which case you need to set it in the context image instead. Hmm. given you already do the context image tweaks, how does work with non-trtt contexts? > +struct i915_vma * > +i915_gem_setup_trtt_vma(struct i915_address_space *vm) > +{ > + struct i915_vma *vma; > + int ret; > + > + vma = kmem_cache_zalloc(to_i915(vm->dev)->vmas, GFP_KERNEL); > + if (vma == NULL) > + return ERR_PTR(-ENOMEM); > + > + INIT_LIST_HEAD(&vma->vma_link); > + INIT_LIST_HEAD(&vma->mm_list); > + INIT_LIST_HEAD(&vma->exec_list); > + vma->vm = vm; > + i915_ppgtt_get(i915_vm_to_ppgtt(vm)); Tempted to write a patch to allow vma->vm = i915_ppggtt_get(i915_vm_to_ppgtt(vm)); ? > + /* Mark the vma as perennially pinned */ s/perennially/permanently/ We don't want to lose the reservation as opposed to having it grow back next year. > + vma->pin_count = 1; > + > + /* Reserve from the 48 bit PPGTT space */ > + vma->node.start = GEN9_TRTT_SEGMENT_START; > + vma->node.size = GEN9_TRTT_SEGMENT_SIZE; > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); > + if (ret) { > + ret = i915_gem_evict_for_vma(vma); > + if (ret == 0) > + ret = drm_mm_reserve_node(&vm->mm, &vma->node); Good. I think we want i915_vm_reserve_node(vm, START, SIZE, &vma) - but have a look at the other callsites to see if we have a common interface. Looks like this would improve i915_vgpu. > +struct drm_i915_gem_context_trtt_param { > + __u64 l3_table_address; > + __u32 invd_tile_val; > + __u32 null_tile_val; > +}; Passes the ABI structure sanity checks. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx