On Tue, 9 Oct 2012 11:54:12 +0200, Daniel Vetter <daniel at ffwll.ch> wrote: > On Tue, Oct 09, 2012 at 09:38:58AM +0100, Chris Wilson wrote: > > @@ -3244,6 +3254,9 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) > > > > i915_gem_object_flush_cpu_write_domain(obj); > > > > + if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0) > > + mb(); > > + > > I think a comment here like we have one for all other gtt related memory > barries would be good. Another thing is the flush_gtt_write_domain uses a > wmb, whereas here we don't bother with micro-optimizing things. So I think > it'd be good to just use a mb() for that, too, if just for consistency. In fact, not only is that the wmb() the easiest to micro-optimise, it is the only one that can be - I think. Around manipulating the fence, we need a read/write barrier in case we have any pending accesses through the fenced region, since the register write may be reordered passed the memory reads since there is no obvious dependency. That might just be heightened paranoia and our memory controller isn't that smart. Yet. So those two need to be mb() so that I can sleep safely at night. For the mb() inside set-to-gtt-domain, I don't have a robust explanation other than that empirically we need a barrier, therefore there is some lingering incoherency when reusing a bo. (The hangs always seem to occur when crossing a page boundary, we see stale data.) You could attempt to insert a read/write barrier depending upon actual usage, but it hardly seems worth the effort. > Also, you know the grumpy maintainer drill: Could we exercise these > barriers with a minimal i-g-t testcase, please? Since you've managed to > kill your machine by removing them, they're no longer just there to keep > us happy, hence I'd like to have them exercised ... Still hunting. > Another thing that just crossed my mind: Could we lack a set of mb()s for > cpu access on llc platforms? For non-coherent platforms the mb() in the > clflush paths will do that, but on llc platforms I couldn't find anything. > And that lp bugs seems to make an excellent case for them being required > ... Yes, with LLC we need to treat the GPU as another core and so put similar SMP-esque memory barriers in place. -Chris -- Chris Wilson, Intel Open Source Technology Centre