On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote: > On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote: > > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote: > > > As we plumb the code with more VM information, it has become more > > > obvious that the easiest way to deal with bind and unbind is to simply > > > put the function pointers in the vm, and let those choose the correct > > > way to handle the page table updates. This change allows many places in > > > the code to simply be vm->bind, and not have to worry about > > > distinguishing PPGTT vs GGTT. > > > > > > NOTE: At some point in the future, brining back insert_entries may in > > > fact be desirable in order to use 1 bind/unbind for multiple generations > > > of PPGTT. For now however, it's just not necessary. > > > > I need to check the -internal tree again, but I'm rather sure that we need > > ->insert_entries. In that case I don't want to remove it here in the > > upstream tree since I have no intention to carry the re-add patch in > > -internal ;-) > > We do use it for i915_ppgtt_bind_object(), however it should be easily > fixable since the mini-series is exactly about removing > i915_ppgtt_bind_object, and making into vm->bind_object. I think it's > fair if you ask me to fix this up on -internal as well, before merging > it, but with that one exception - I still believe this is the right > direction to go in. > > > > > > > > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net> > > > --- > > > drivers/gpu/drm/i915/i915_drv.h | 9 +++++ > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 81 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index e6694ae..c2a9c98 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -484,9 +484,18 @@ struct i915_address_space { > > > /* FIXME: Need a more generic return type */ > > > gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr, > > > enum i915_cache_level level); > > > + > > > + /** Unmap an object from an address space. This usually consists of > > > + * setting the valid PTE entries to a reserved scratch page. */ > > > + void (*unbind_object)(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj); > > > > void (*unbind_vma)(struct i915_vma *vma); > > void (*bind_vma)(struct i915_vma *vma, > > enum i915_cache_level cache_level); > > > > I think if you do this as a follow-up we might as well bikeshed the > > interface a bit. Again (I know, broken record) for me it feels > > semantically much cleaner to talk about binding/unbindinig a vma instead > > of an (obj, vm) pair ... > > > > So as mentioned (and I haven't yet responded to the other email, but > I'll be broken record there also) - I don't disagree with you. My > argument is the performance difference should be negligible, and the code > as is, is decently tested. Changing this requires changing so much, I'd > rather do the conversion on top. See the other mail thread for more... > > > > void (*clear_range)(struct i915_address_space *vm, > > > unsigned int first_entry, > > > unsigned int num_entries); > > > + /* Map an object into an address space with the given cache flags. */ > > > + void (*bind_object)(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj, > > > + enum i915_cache_level cache_level); > > > void (*insert_entries)(struct i915_address_space *vm, > > > struct sg_table *st, > > > unsigned int first_entry, > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > index c0d0223..31ff971 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > @@ -45,6 +45,12 @@ > > > #define GEN6_PTE_CACHE_LLC_MLC (3 << 1) > > > #define GEN6_PTE_ADDR_ENCODE(addr) GEN6_GTT_ADDR_ENCODE(addr) > > > > > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj, > > > + enum i915_cache_level cache_level); > > > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj); > > > + > > > static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, > > > enum i915_cache_level level) > > > { > > > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt) > > > } > > > ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES; > > > ppgtt->enable = gen6_ppgtt_enable; > > > + ppgtt->base.unbind_object = gen6_ppgtt_unbind_object; > > > ppgtt->base.clear_range = gen6_ppgtt_clear_range; > > > + ppgtt->base.bind_object = gen6_ppgtt_bind_object; > > > ppgtt->base.insert_entries = gen6_ppgtt_insert_entries; > > > ppgtt->base.cleanup = gen6_ppgtt_cleanup; > > > ppgtt->base.scratch = dev_priv->gtt.base.scratch; > > > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt, > > > cache_level); > > > } > > > > > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm, > > > + struct drm_i915_gem_object *obj, > > > + enum i915_cache_level cache_level) > > > +{ > > > + const unsigned long entry = i915_gem_obj_offset(obj, vm); > > > + > > > + gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT, > > > + cache_level); > > > + obj->has_aliasing_ppgtt_mapping = 1; > > > > Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a > > bit wrong here. Either we do the ppgtt insert_entries call as part of the > > global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special > > global gtt binding call for execbuf. > > > > Thinking about this some more we might need bind flags with > > > > #define VMA_BIND_CPU (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */ > > #define VMA_BIND_GPU (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */ > > > > since otherwise we can't properly encapsulate the aliasing ppgtt binding > > logic into vm->bind. So in the end we'd have > > > > void ggtt_bind_vma(vma, bind_flags, cache_level) > > { > > ggtt_vm = vma->vm; > > WARN_ON(ggtt_vm != &dev_priv->gtt.base); > > > > if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & BIND_CPU)) && > > !obj->has_global_gtt_mapping) { > > ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_leve); > > vma->obj->has_global_gtt_mapping = true; > > } > > > > if ((ggtt_vm->aliasing_ppgtt && (bind_flags & BIND_GPU)) && > > !obj->has_ppgtt_mapping) { > > ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj, > > vma->node.start, > > cache_leve); > > vma->obj->has_ppgtt_mapping = true; > > } > > } > > > > Obviously completely untested, but I hope I could get the idea accross. > > > > Cheers, Daniel > > To me, aliasing ppgtt is just a wart that doesn't fit well with > anything. As such, my plan was to hide as much of it as possible in ggtt > functions. Using some kind of flag on ggtt_bind() we can determine if > the user actually wants ggtt, and if so bind to both, else just use > aliasing ppgtt. None of that code appears here because I want to make > the diff churn as small as possible, and hadn't completely thought it > all through. > > Now after typing that (and this really did happen), I just looked at > your function, and it seems to be more or less exactly what I just > typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you > really just want one flag which specifies "bind it in the global gtt, > sucka" > > Now having just typed /that/, it was indeed my plan. So as long as > nothing really bothers you with the bind/unbind() stuff, I can move > forward with a patch on top to fix it. > I changed my mind already. A patch on top doesn't make sense. I'll try to fix this one up as is. -- Ben Widawsky, Intel Open Source Technology Center