On Fri, Apr 07, 2017 at 09:53:05AM +0100, Tvrtko Ursulin wrote:
>
> On 06/04/2017 09:55, Chris Wilson wrote:
> >On Thu, Apr 06, 2017 at 09:18:36AM +0100, Tvrtko Ursulin wrote:
>
> [snip]

[snip]

> >>>>+	if (swap_vcs && engine == VCS1)
> >>>>+		engine = VCS2;
> >>>>+	else if (swap_vcs && engine == VCS2)
> >>>>+		engine = VCS1;
> >>>>+	w->eb.flags = eb_engine_map[engine];
> >>>>+	w->eb.flags |= I915_EXEC_HANDLE_LUT;
> >>>>+	if (!seqnos)
> >>>>+		w->eb.flags |= I915_EXEC_NO_RELOC;
> >>>
> >>>Doesn't look too hard to get the relocation right. Forcing relocations
> >>>between batches is probably a good one to check (just to say don't do
> >>>that)
> >>
> >>I am not following here? You are saying don't do relocations at all?
> >>How do I make sure things stay fixed and even how to find out where
> >>they are in the first pass?
> >
> >Depending on the workload, it may be informative to also do comparisons
> >between NORELOC and always RELOC. Personally I would make sure we were
> >using NORELOC as this should be a simulator/example.
>
> How do I use NORELOC? I mean, I have to know where the objects will
> be pinned, or be able to pin them first and know they will remain
> put. What am I not understanding here?

It will be assigned an address on first execution.

Can I quote the spiel I wrote for i915_gem_execbuffer.c and see if that
answers how to use NORELOC:

 * Reserving resources for the execbuf is the most complicated phase. We
 * neither want to have to migrate the object in the address space, nor do
 * we want to have to update any relocations pointing to this object. Ideally,
 * we want to leave the object where it is and for all the existing relocations
 * to match. If the object is given a new address, or if userspace thinks the
 * object is elsewhere, we have to parse all the relocation entries and update
 * the addresses.
 * Userspace can set the I915_EXEC_NO_RELOC flag to hint that
 * all the target addresses in all of its objects match the value in the
 * relocation entries and that they all match the presumed offsets given by the
 * list of execbuffer objects. Using this knowledge, we know that if we haven't
 * moved any buffers, all the relocation entries are valid and we can skip
 * the update. (If userspace is wrong, the likely outcome is an impromptu GPU
 * hang.) The requirements for using I915_EXEC_NO_RELOC are:
 *
 *      The addresses written in the objects must match the corresponding
 *      reloc.presumed_offset which in turn must match the corresponding
 *      execobject.offset.
 *
 *      Any render targets written to in the batch must be flagged with
 *      EXEC_OBJECT_WRITE.
 *
 *      To avoid stalling, execobject.offset should match the current
 *      address of that object within the active context.
 *

Does that make sense? What questions remain unanswered?

Hmm, I usually sum it up as

	batch[reloc.offset] == reloc.presumed_offset + reloc.delta;

and

	execobj.offset == reloc.presumed_offset

must be true at the time of execbuf. Note that upon relocation,
batch[reloc.offset], reloc.presumed_offset and execobj.offset are updated.
This is important to remember if you are prerecording the reloc/execobj
arrays, and not feeding back the results of execbuf between phases.

> But in general is this correctly implementing your idea for queue
> depth estimation?

From my rough checklist:

* writes engine->next_seqno++ after each op (in this case end of batch)
* qlen[engine] = engine->next_seqno - *engine->current_seqno;

Design looks right. Implementation requires checking... I'll be back.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
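
For anyone prerecording reloc/execobj arrays, here is a rough sketch of the two NORELOC conditions and the queue-length arithmetic from the checklist. It is only an illustration under assumptions: the structs are simplified, hypothetical stand-ins rather than the real uapi definitions in i915_drm.h, the target is taken to be an index into the execobj array, addresses in the batch are assumed to be 64 bits wide, seqnos are assumed to be 32 bits wide, and noreloc_ok() and queue_len() are made-up helper names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the uapi structs. */
struct sketch_reloc {
	uint32_t target;          /* assumed: index into the execobj array */
	uint32_t delta;           /* added to the target's address */
	uint64_t offset;          /* byte offset of the address in the batch */
	uint64_t presumed_offset; /* where userspace believes the target is */
};

struct sketch_execobj {
	uint64_t offset;          /* address userspace passes for this object */
};

/*
 * Returns true only if every relocation satisfies both
 *   batch[reloc.offset] == reloc.presumed_offset + reloc.delta
 *   execobj.offset      == reloc.presumed_offset
 */
static bool noreloc_ok(const uint8_t *batch, size_t batch_len,
		       const struct sketch_reloc *relocs, size_t nreloc,
		       const struct sketch_execobj *objs, size_t nobj)
{
	for (size_t i = 0; i < nreloc; i++) {
		const struct sketch_reloc *r = &relocs[i];
		uint64_t written;

		/* Reject targets and batch offsets that are out of range. */
		if (r->target >= nobj)
			return false;
		if (r->offset > batch_len ||
		    batch_len - r->offset < sizeof(written))
			return false;

		/* Read the address currently written into the batch. */
		memcpy(&written, batch + r->offset, sizeof(written));
		if (written != r->presumed_offset + r->delta)
			return false;
		if (objs[r->target].offset != r->presumed_offset)
			return false;
	}
	return true;
}

/*
 * Queue length as in the checklist: next_seqno - current_seqno.
 * Unsigned subtraction stays correct across 32-bit seqno wraparound.
 */
static uint32_t queue_len(uint32_t next_seqno, uint32_t current_seqno)
{
	return next_seqno - current_seqno;
}
```

Running a check like noreloc_ok() over prerecorded arrays before each phase is one way to notice that the results of an earlier execbuf were never fed back; it says nothing about whether the kernel has since moved a buffer.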