Quoting Tvrtko Ursulin (2019-11-20 14:16:17)
> 
> On 20/11/2019 13:39, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-20 13:16:51)
> >>
> >> On 20/11/2019 09:33, Chris Wilson wrote:
> >>> +
> >>> +	/*
> >>> +	 * Our goal here is to retire _idle_ timelines as soon as
> >>> +	 * possible (as they are idle, we do not expect userspace
> >>> +	 * to be cleaning up anytime soon).
> >>> +	 *
> >>> +	 * If the tl->active_count is already zero, someone else
> >>> +	 * should have retired the timeline. Equally if the timeline
> >>> +	 * is currently locked, either it is being retired elsewhere
> >>> +	 * or about to be!
> >>> +	 */
> >>> +	if (atomic_read(&tl->active_count) &&
> >>> +	    mutex_trylock(&tl->mutex)) {
> >>> +		retire_requests(tl);
> >>> +		mutex_unlock(&tl->mutex);
> >>> +	}
> >>> +	intel_timeline_put(tl);
> >>> +
> >>> +	GEM_BUG_ON(!next);
> >>> +	tl = ptr_mask_bits(next, 1);
> >>
> >> You sometimes expect engine->retire to contain 0x1?
> > 
> > Yes, imagine that we are submitting very fast such that we schedule_out
> > the same context before the worker ran, we would then try to
> > add_retire() the same timeline again. So I was using BIT(0) to tag an
> > active element in the retirement list.
> 
> If we have schedule_out on the same context before the retire ran then
> add_retire is a no-op due tl->retire being set.

The first element in the list has timeline->retire = NULL. We only know
it is set because we make it timeline->retire = 1 instead.

> It could be a race between engine_retire and add_retire where
> engine->retire is set to NULL so latter would set tl->retire to NULL |
> BIT(0).
> 
> But BIT(0) is only filtered out and not acted upon anywhere so I am
> still lost.
> 
> So maybe from the opposite angle, what goes wrong if you drop all BIT(0)
> business?

You insert the element into the list twice causing a loop. Now that loop
is short lived as we clear timeline->retire on processing, but we may
still end up dropping a reference twice.
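To see concretely why the untagged variant loops, here is a minimal
single-threaded userspace sketch (names `tnode`, `untagged_push` and
`untagged_head` are hypothetical, not from the patch; the concurrency is
elided). The first node pushed has next == NULL, which is
indistinguishable from "not queued", so a second push links the node to
itself:

```c
#include <stddef.h>

struct tnode {
	struct tnode *next;	/* NULL is meant to be "not queued" */
};

static struct tnode *untagged_head;

/*
 * Hypothetical push WITHOUT the BIT(0) tag: we treat next == NULL as
 * "not on the list", but the first element on the list also has
 * next == NULL, so it cannot be recognised as already queued.
 */
static int untagged_push(struct tnode *n)
{
	if (n->next)		/* misses the head of the list! */
		return 0;

	n->next = untagged_head;
	untagged_head = n;
	return 1;
}
```

Pushing the same node twice is accepted and produces `a.next == &a`, the
self-referencing loop (and potential double reference drop) described
above; packing BIT(0) into next on insertion makes it non-NULL even for
the first element, closing the hole.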
> > 
> 
> >>> +	} while (tl);
> >>> +}
> >>> +
> >>> +static bool add_retire(struct intel_engine_cs *engine,
> >>> +		       struct intel_timeline *tl)
> >>> +{
> >>> +	struct intel_timeline *first = READ_ONCE(engine->retire);
> >>> +
> >>> +	/*
> >>> +	 * We open-code a llist here to include the additional tag [BIT(0)]
> >>> +	 * so that we know when the timeline is already on a
> >>> +	 * retirement queue: either this engine or another.
> >>> +	 *
> >>> +	 * However, we rely on that a timeline can only be active on a single
> >>> +	 * engine at any one time and that add_retire() is called before the
> >>> +	 * engine releases the timeline and transferred to another to retire.
> >>> +	 */
> >>> +
> >>> +	if (READ_ONCE(tl->retire)) /* already queued */
> >>> +		return false;
> >>
> >> Can't this go first in the function?
> > 
> > Conceptually it is. And I made it so because I also decided against
> > having the READ_ONCE() at the top.
> 
> But READ_ONCE is at the top, just a different one which is discarded if
> this one is NULL.

static bool add_retire(struct intel_engine_cs *engine,
		       struct intel_timeline *tl)
{
	struct intel_timeline *first;

	/*
	 * We open-code a llist here to include the additional tag [BIT(0)]
	 * so that we know when the timeline is already on a
	 * retirement queue: either this engine or another.
	 *
	 * However, we rely on that a timeline can only be active on a single
	 * engine at any one time and that add_retire() is called before the
	 * engine releases the timeline and transferred to another to retire.
	 */

	if (READ_ONCE(tl->retire)) /* already queued */
		return false;

	if (!atomic_read(&tl->active_count)) /* already retired */
		return false;

	intel_timeline_get(tl);
	first = READ_ONCE(engine->retire);
	do
		tl->retire = ptr_pack_bits(first, 1, 1);
	while (!try_cmpxchg(&engine->retire, &first, tl));

	return !first;
}

is the current incarnation.
> >>> +
> >>> +	intel_timeline_get(tl);
> >>> +	do
> >>> +		tl->retire = ptr_pack_bits(first, 1, 1);
> >>
> >> Here you rely on assignment being atomic right?
> > 
> > Ish. Here we rely on the timeline being owned by the engine so it cannot
> > be submitted by another (and so schedule_out called) until this engine
> > has released it.
> > 
> > It is a weak point for generality, but the ordering is strong in
> > execlists.
> > 
> >>> +	while (!try_cmpxchg(&engine->retire, &first, tl));
> >>
> >> So the loop is effectively creating a chain of timelines to retire on
> >> this engine.
> >>
> >> What happens with virtual engines when a timeline goes to different
> >> engine before (well or any single timeline context) the retire worker
> >> runs? Ah okay, it gets re-assigned to the most recent engine.
> > 
> > Right. The engine_retire() doesn't care which engine the timelines were
> > run on, it's just a list of suspected idle timelines.
> > 
> >> I am not sure about the BIT(0) business. It's always set on write so I
> >> am not getting why it is useful.
> > 
> > It's also set to 0 on consumption :)
> 
> Another thing - assert that engine->retire is NULL after flush_work in
> intel_engine_fini_retire could be useful.

Yup, wilco.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx