On Sat, Aug 06, 2016 at 11:26:22AM +0100, Chris Wilson wrote:
> On Fri, Aug 05, 2016 at 10:13:22PM +0100, Chris Wilson wrote:
> > In the debate as to whether the second read of active->request is
> > ordered after the dependent reads of the first read of active->request,
> > just give in and throw a smp_rmb() in there so that ordering of loads is
> > assured.
> >
> > v2: Explain the manual smp_rmb()
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c         | 25 ++++++++++++++++++++-----
> >  drivers/gpu/drm/i915/i915_gem_request.h |  3 +++
> >  2 files changed, 23 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index f4f8eaa90f2a..654f0b015f97 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3735,7 +3735,7 @@ i915_gem_object_ggtt_unpin_view(struct drm_i915_gem_object *obj,
> >  	i915_vma_unpin(i915_gem_obj_to_ggtt_view(obj, view));
> >  }
> >
> > -static __always_inline unsigned __busy_read_flag(unsigned int id)
> > +static __always_inline unsigned int __busy_read_flag(unsigned int id)
> >  {
> >  	/* Note that we could alias engines in the execbuf API, but
> >  	 * that would be very unwise as it prevents userspace from
> > @@ -3753,7 +3753,7 @@ static __always_inline unsigned int __busy_write_id(unsigned int id)
> >  	return id;
> >  }
> >
> > -static __always_inline unsigned
> > +static __always_inline unsigned int
> >  __busy_set_if_active(const struct i915_gem_active *active,
> >  		     unsigned int (*flag)(unsigned int id))
> >  {
> > @@ -3770,19 +3770,34 @@ __busy_set_if_active(const struct i915_gem_active *active,
> >
> >  	id = request->engine->exec_id;
> >
> > -	/* Check that the pointer wasn't reassigned and overwritten. */
> > +	/* Check that the pointer wasn't reassigned and overwritten.
> > +	 *
> > +	 * In __i915_gem_active_get_rcu(), we enforce ordering between
> > +	 * the first rcu pointer dereference (imposing a
> > +	 * read-dependency only on access through the pointer) and
> > +	 * the second lockless access through the memory barrier
> > +	 * following a successful atomic_inc_not_zero(). Here there
> > +	 * is no such barrier, and so we must manually insert an
> > +	 * explicit read barrier to ensure that the following
> > +	 * access occurs after all the loads through the first
> > +	 * pointer.
> > +	 *
> > +	 * The corresponding write barrier is part of
> > +	 * rcu_assign_pointer().
> > +	 */
> > +	smp_rmb();
>
> Are you sure this should not just be a read_barrier_depends()?
>
> active->request is data dependent on the earlier reads through it, and
> here we are only caring that those loads are completed before we double
> check the request hasn't been overwritten.

There's no data dependency between loading request->engine->exec_id and
(re)loading active->request. I think it needs to be a full smp_rmb().
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
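
For anyone following the thread, here is a rough userspace sketch of the
load ordering being discussed. It is not the i915 code: the struct layouts
and the name busy_id_if_stable() are made up for illustration, the
acquire load stands in for rcu_dereference(), and the C11 acquire fence is
only an approximate analogue of the kernel's smp_rmb(). The point it tries
to show is that the reload of active->request does not depend on the value
loaded from request->engine->exec_id, so a dependency barrier alone would
not order the two loads.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct engine { unsigned int exec_id; };
struct request { struct engine *engine; };
struct active { _Atomic(struct request *) request; };

/*
 * Return the engine id of the tracked request, or 0 if there is no
 * request or the tracker was reassigned while we were reading.
 */
static unsigned int busy_id_if_stable(struct active *active)
{
	struct request *rq;
	unsigned int id;

	/* First load of the pointer (stand-in for rcu_dereference()). */
	rq = atomic_load_explicit(&active->request, memory_order_acquire);
	if (!rq)
		return 0;

	/* These loads depend on rq, so they are ordered after the load above. */
	id = rq->engine->exec_id;

	/*
	 * The reload below does not depend on id, so nothing orders it
	 * after the loads through rq unless we add a read barrier.
	 * Approximate userspace stand-in for smp_rmb().
	 */
	atomic_thread_fence(memory_order_acquire);

	/* Second load: check the pointer was not reassigned meanwhile. */
	if (rq != atomic_load_explicit(&active->request, memory_order_relaxed))
		return 0;

	return id;
}
```

Without the fence, a CPU or compiler is free to perform the second load
of active->request before the load of exec_id, which would make the
"has it been overwritten" check meaningless.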