On Thu, Jul 28, 2016 at 09:49:58PM +0100, Chris Wilson wrote:
> On Thu, Jul 28, 2016 at 12:23:40PM +0200, Daniel Vetter wrote:
> > I think we have a race here still: The issue is that the
> > kref_get_unless_zero is an unordered atomic, and the rcu_dereference is
> > only an smp_read_barrier_depends, which doesn't prevent the fetch from
> > happening before the atomic_add_unless.
> >
> > Well, until I opened memory-barriers.txt and learned that
> > atomic_add_unless is a full smp_mb() on both sides on success. That's a
> > bit too tricky for my taste, what about the following comment:
> >
> > 	/* When request_get_rcu succeeds the underlying
> > 	 * atomic_add_unless has a full smp_mb() on both sides.
> > 	 * This ensures that the rcu_dereference() below can't be
> > 	 * reordered before the refcounting increase has
> > 	 * happened, which prevents the request from being reused.
> > 	 */
> >
> > I couldn't poke any other holes into this, and we're reusing the fence
> > rcu functions where appropriate. With the comment:
>
> I guess it doesn't hurt to make this really, really clear.

Perfect! Well, almost; one nit:

> 	/* What stops the following rcu_dereference() from occurring
> 	 * before the above i915_gem_request_get_rcu()? If we were
> 	 * to read the value before pausing to get the reference to
> 	 * the request, we may not notice a change in the active
> 	 * tracker.
> 	 *
> 	 * The rcu_dereference() is a mere read barrier, which means

s/read barrier/barrier of depending reads/, rcu_dereference is not even a
full rmb!

> 	 * that operations after it will appear after, neither the

hence also: s/operations/any operations through the read pointer/

Aside: I'm always impressed how alpha managed to mispredict dependent reads
somehow ... "sorry, my magic 8ball had a glitch"!?
-Daniel

> 	 * CPU nor the compiler will bring them forwards. However,
> 	 * that does not restrict the rcu_dereference() itself. The
> 	 * read may be performed earlier by an out-of-order CPU, or
> 	 * adventurous compiler.
> 	 *
> 	 * The atomic operation at the heart of
> 	 * i915_gem_request_get_rcu(), see fence_get_rcu(), is
> 	 * atomic_inc_not_zero() which is only a full memory barrier
> 	 * when successful. That is, if i915_gem_request_get_rcu()
> 	 * returns the request (and so with the reference count
> 	 * incremented) then the following read for rcu_dereference()
> 	 * must occur after the atomic operation and so confirm
> 	 * that this request is the one currently being tracked.
> 	 */
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
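
P.S. To make the ordering argument concrete, here is a minimal sketch of
the retry loop such a comment would sit in. This is not the exact i915
code: the i915_gem_active struct, its request field, the function name and
i915_gem_request_put() are illustrative assumptions; only
i915_gem_request_get_rcu() and fence_get_rcu() are from the discussion
above.

	/* Sketch only: struct/field names are placeholders. */
	static struct drm_i915_gem_request *
	active_get_request_rcu(struct i915_gem_active *active)
	{
		struct drm_i915_gem_request *request;

		rcu_read_lock();
		do {
			request = rcu_dereference(active->request);
			if (!request)
				break;

			/* fence_get_rcu() boils down to
			 * kref_get_unless_zero(), i.e.
			 * atomic_inc_not_zero(): a full smp_mb() on both
			 * sides when it succeeds.
			 */
			request = i915_gem_request_get_rcu(request);
			if (!request)
				continue; /* freed under us, retry */

			/* The full barrier from the successful atomic
			 * above is what stops this re-read from being
			 * hoisted before the refcount increment: if the
			 * pointer still matches, the reference we took is
			 * on the currently tracked request.
			 */
			if (request == rcu_dereference(active->request))
				break;

			/* Raced with retire/reuse; drop and try again. */
			i915_gem_request_put(request);
		} while (1);
		rcu_read_unlock();

		return request;
	}

The second rcu_dereference() is the read the comment is about: without the
smp_mb() implied by a successful atomic_inc_not_zero(), an out-of-order CPU
or an adventurous compiler could perform it before the refcount bump, and
the confirmation check would be meaningless.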