On Tue, 23 Oct 2012 07:57:12 -0700
Ben Widawsky <ben at bwidawsk.net> wrote:

> On 2012-10-23 02:59, Chris Wilson wrote:
> > On Mon, 22 Oct 2012 18:34:11 -0700, Ben Widawsky <ben at bwidawsk.net> wrote:
> >> +/*
> >> + * Binds an object into the global gtt with the specified cache level. The object
> >> + * will be accessible to the GPU via commands whose operands reference offsets
> >> + * within the global GTT as well as accessible by the GPU through the GMADR
> >> + * mapped BAR (dev_priv->mm.gtt->gtt).
> >> + */
> >> +static void gen6_ggtt_bind_object(struct drm_i915_gem_object *obj,
> >> +				  enum i915_cache_level level)
> >> +{
> >> +	struct drm_device *dev = obj->base.dev;
> >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> >> +	struct sg_table *st = obj->pages;
> >> +	struct scatterlist *sg = st->sgl;
> >> +	const int first_entry = obj->gtt_space->start >> PAGE_SHIFT;
> >> +	const int max_entries = dev_priv->mm.gtt->gtt_total_entries - first_entry;
> >> +	gtt_pte_t __iomem *gtt_entries = dev_priv->mm.gtt->gtt + first_entry;
> >> +	int unused, i = 0;
> >> +	unsigned int len, m = 0;
> >> +
> >> +	for_each_sg(st->sgl, sg, st->nents, unused) {
> >> +		len = sg_dma_len(sg) >> PAGE_SHIFT;
> >> +		for (m = 0; m < len; m++) {
> >> +			dma_addr_t addr = sg_dma_address(sg) + (m << PAGE_SHIFT);
> >> +			gtt_entries[i] = pte_encode(dev, addr, level);
> >> +			i++;
> >> +			if (WARN_ON(i > max_entries))
> >> +				goto out;
> >> +		}
> >> +	}
> >> +
> >> +out:
> >> +	/* XXX: This serves as a posting read preserving the way the old code
> >> +	 * works. It's not clear if this is strictly necessary or just voodoo
> >> +	 * based on what I've tried to gather from the docs.
> >> +	 */
> >> +	readl(&gtt_entries[i-1]);
> >
> > It will be required until we replace the voodoo with more explicit
> > mb().
> > -Chris
>
> Actually, after we introduce the FLSH_CNTL patch from Jesse/me later in
> the series, I think we just want a POSTING_READ on that register. It is
> technically "required" by our desire to some day WC the registers, and
> should synchronize everything else for us.
>
> After a quick read of memory_barriers.txt (again), I think mmiowb is
> actually what we might want in addition to the POSTING_READ I'd add.

On a big NUMA system maybe (i.e. on nothing we run on yet), but on x86
mmiowb doesn't do anything other than act as a compiler optimization
barrier.

-- 
Jesse Barnes, Intel Open Source Technology Center
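
For readers following along, here is a rough userspace sketch of the two
ideas in this thread: a compiler-only barrier (which is all mmiowb amounts
to on x86) and a read-back of the last entry written. It is an illustration
only, not the driver code. The names compiler_barrier, fake_pte_encode and
write_ptes are hypothetical, the PTE encoding is made up, and an ordinary
array stands in for the __iomem mapping, so the read-back here cannot flush
anything the way a posting read on real MMIO would.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a compiler-only barrier: it emits no
 * instruction and only stops the compiler from reordering memory
 * accesses across this point (GCC/Clang syntax). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

typedef uint32_t gtt_pte_t;

/* Made-up encoding for illustration: page-aligned address plus a valid
 * bit. This is NOT the real gen6 PTE layout. */
static gtt_pte_t fake_pte_encode(uint64_t addr)
{
	return (gtt_pte_t)(addr & ~(uint64_t)0xfff) | 1u;
}

/* Write one PTE per page, never more than max_entries, then read the
 * last written entry back. Returns the number of entries written. */
static size_t write_ptes(volatile gtt_pte_t *table, size_t max_entries,
			 uint64_t base, size_t npages, gtt_pte_t *readback)
{
	size_t n = npages < max_entries ? npages : max_entries;
	size_t i;

	for (i = 0; i < n; i++)
		table[i] = fake_pte_encode(base + ((uint64_t)i << 12));

	/* Compiler ordering only; no hardware ordering is implied. */
	compiler_barrier();

	/* Read-back of the final entry. On a real MMIO mapping this is the
	 * access that would act as the posting read. */
	if (n > 0 && readback)
		*readback = table[n - 1];

	return n;
}
```

Note that the sketch clamps the count before the loop instead of checking
after each write as the quoted patch does. That is a choice made for the
illustration, not a claim about what the patch should do.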