Quoting Tvrtko Ursulin (2019-09-27 14:58:24) > > On 27/09/2019 13:32, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2019-09-27 13:25:23) > >> > >> On 27/09/2019 13:16, Chris Wilson wrote: > >>> Quoting Tvrtko Ursulin (2019-09-27 13:08:51) > >>>> > >>>> On 27/09/2019 12:25, Chris Wilson wrote: > >>>>> Quoting Tvrtko Ursulin (2019-09-27 12:10:29) > >>>>>> > >>>>>> On 25/09/2019 11:01, Chris Wilson wrote: > >>>>>>> +struct dma_fence * > >>>>>>> +__i915_active_fence_set(struct i915_active_fence *active, > >>>>>>> + struct dma_fence *fence) > >>>>>> > >>>>>> Can you add a short comment for this function explaining the racyness > >>>>>> and so why it returns prev and what should the callers do with it? > >>>>> > >>>>> Before using this function, we had the callers declare what mutex guards > >>>>> this timeline and we check here that is held. > >>>> > >>>> No, I mean because it has to reload prev and return it to the caller, > >>>> which implies that is to handle some designed-in racy-ness which I think > >>>> should be mentioned. > >>> > >>> But that's not racing with the caller, that just racing with the > >>> callback from the interrupt handler and the reason why we have to check > >>> after serialising is called out. /* serialise with prev->cb_list */ ? > >>> > >>> The caller is responsible for ensuring that prev is executed before > >>> fence to keep the timeline in the same order as recorded. > >> > >> I did not say it is racing with the caller, but that the caller has to > >> handle a race. > > > > But the caller only has to care about "was there already an active fence > > on this timeline? If so, I must make sure it executes before me and take > > that into consideration for the flow along the timeline" > > > > I'm not seeing what race the caller has to be concerned with here. If > > they replace the last request in the timeline, they have it returned. > > If there was no request previously in the timeline, they have NULL. > > (That just seems straightforward.) > > > >> "Serialise with prev->cb_list" is too low level for me. Trust me, I > >> always struggle with the active tracking code since there is so many > >> indirections and relationships. So in the absence of a visual diagram a > >> higher level comment would be beneficial for the future self. Just > >> expanding on what you replied here talking about how interrupts > >> interacts with new stuff entering tracking would do it. > > > > It's just about that the callback may be executing and so we need to > > serialise the list manipulation under the lock; having performed that > > list manipulation, we then know whether or not we were successful in > > capturing the previous fence. > > Ok ok :), just expand the comment next to re-fetch of prev = > active->fence to that effect, that's all I'm asking. :) > > It will make it easier for future reader to understand why it is > required and under what circumstances could active->fence have became > NULL in there. spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING); __list_del_entry(&active->cb.node); spin_unlock(prev->lock); /* serialise with prev->cb_list */ + + /* + * active->fence is reset by the callback from inside + * interrupt context. We need to serialise our list + * manipulation with the fence->lock to prevent the prev + * being lost inside an interrupt (it can't be replaced as + * no other caller is allowed to enter __i915_active_fence_set + * as we hold the timeline lock). After serialising with + * the callback, we need to double check which ran first, + * our list_del() or the callback... + */ prev = rcu_access_pointer(active->fence); } Feels a little too tautological, but it is Friday afternoon. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx