On Tue, Mar 23, 2021 at 09:14:36AM +0000, Tvrtko Ursulin wrote: > > On 22/03/2021 16:43, Daniel Vetter wrote: > > On Mon, Mar 22, 2021 at 4:31 PM Tvrtko Ursulin > > <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > On 22/03/2021 14:57, Daniel Vetter wrote: > > > > On Mon, Mar 22, 2021 at 3:33 PM Tvrtko Ursulin > > > > <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > > On 22/03/2021 14:09, Daniel Vetter wrote: > > > > > > On Mon, Mar 22, 2021 at 11:22:01AM +0000, Tvrtko Ursulin wrote: > > > > > > > > > > > > > > On 19/03/2021 22:38, Jason Ekstrand wrote: > > > > > > > > This API allows one context to grab bits out of another context upon > > > > > > > > creation. It can be used as a short-cut for setparam(getparam()) for > > > > > > > > things like I915_CONTEXT_PARAM_VM. However, it's never been used by any > > > > > > > > real userspace. It's used by a few IGT tests and that's it. Since it > > > > > > > > doesn't add any real value (most of the stuff you can CLONE you can copy > > > > > > > > in other ways), drop it. > > > > > > > > > > > > > > No complaints to remove if it ended up unused outside IGT. Latter is a _big_ > > > > > > > problem though, since it is much more that a few IGT tests. So I really > > > > > > > think there really needs to be an evaluation and a plan for that (we don't > > > > > > > want to lose 50% of the coverage over night). > > > > > > > > > > > > > > > There is one thing that this API allows you to clone which you cannot > > > > > > > > clone via getparam/setparam: timelines. However, timelines are an > > > > > > > > implementation detail of i915 and not really something that needs to be > > > > > > > > > > > > > > Not really true timelines are i915 implementation detail. They are in fact a > > > > > > > dma-fence context:seqno concept, nothing more that than. I think you are > > > > > > > probably confusing struct intel_timeline with the timeline wording in the > > > > > > > uapi. Former is i915 implementation detail, but context:seqno are truly > > > > > > > userspace timelines. > > > > > > > > > > > > I think you're both saying the same thing and talking a bit past each > > > > > > another. > > > > > > > > > > > > Yes the timeline is just a string of dma_fence, that's correct. Now > > > > > > usually if you submit batches with execbuf, we have 3 ways to synchronize > > > > > > concurrent submission: implicit sync, sync_file and drm_syncob. They all > > > > > > map to different needs in different protocols/render apis. > > > > > > > > > > > > Now in one additional case the kernel makes sure that batchbuffers are > > > > > > ordered, and that's when you submit them to the same hw ctx. Because > > > > > > there's only 1 hw context and you really can't have batchbuffers run on > > > > > > that single hw context out of order. That's what the timeline object we > > > > > > talk about here is. But that largely is an internal implementation detail, > > > > > > which happens to also use most/all the same infrastructure as the > > > > > > dma_fence uapi pieces above. > > > > > > > > > > > > Now the internal implementation detail leaking here is that we exposed > > > > > > this to userspace, without there being any need for this. What Jason > > > > > > implements with syncobj in the next patch is essentially what userspace > > > > > > should have been using for cross-engine sync. media userspace doesn't care > > > > > > about interop with winsys/client apis, so they equally could have used > > > > > > implicit sync or sync_file here (which I think is the solution now for the > > > > > > new uapi prepped internally), since they all are about equally powerful > > > > > > for stringing batchbuffers together. > > > > > > > > > > Are you saying we exposed a single timeline of execution per hw context > > > > > via the single timeline flag?! > > > > > > > > Nope. > > > > > > > > > Timelines of execution were always exposed. Any "engine" (ring > > > > > previously) in I915_EXEC_RING_MASK was a single timeline of execution. > > > > > It is completely the same with engine map engines, which are also > > > > > different indices into I915_EXEC_RING_MASK space. > > > > > > > > > > Userspace was aware of these timelines forever as well. Media was > > > > > creating multiple contexts to have multiple timelines (so parallelism). > > > > > Everyone knew that engine-hopping submissions needs to be either > > > > > implicitly or explicitly synchronised, etc. > > > > > > > > Yup, I think we're saying the same thing here. > > > > > > > > > So I really don't see that we have leaked timelines as a concept *now*. > > > > > What the patch has exposed to userspace is a new way to sync between > > > > > timelines and nothing more. > > > > > > > > We've leaked it as something you can now share across hw context. > > > > > > Okay so we agree on most things but apparently have different > > > definitions of what it means to leak internal implementation details. > > > > > > While at the same time proof that we haven't leaked the internal > > > implementation details is that Jason was able to implement the single > > > timeline flag with a drm syncobj at the execbuf top level. (Well mostly, > > > ignoring the probably inconsequential difference of one vs multiple > > > fence contexts.) > > > > It's not a matching implementation. It's only good enough for what > > media needs, and essentially what media should have done to begin > > with. > > > > There's substantially different behaviour between SINGLE_TIMELINE and > > what Jason has done here when you race concurrent execbuf calls: > > Former guarantees total ordering, the latter doesn't even try. They > > are not the same thing, but luckily userspace doesn't care about that > > difference. > > Sounds like a very important difference to stress in the commit message. > > Secondly, I am unclear whether we have agreement on whether the single > timeline flag is leaking implementation details of the execlists scheduler > to userspace or not? I do think Jason&me agree on that it does leak an internal concept to userspace that we shouldn't leak. I'm honestly not entirely understanding your argument for why single_timeline isn't an internal concept somehow, and how exposing it to userspace doesn't leak that concept to userspace. Whether internally that concept is now perfectly represented by just struct intel_timeline, or maybe more the seqno/hswp, or more diffused through the code doesn't really change that we have an internal concept that we're now exposing for sharing in ways that wasn't possible before. -Daniel > Regards, > > Tvrtko > > > > > Aside, just to make sure this wont get lost: I do agree that we should > > only allow this up to maybe ADL, and reject it on anything new (maybe > > including dg1 while we're at it, since the pci ids for that aren't > > even close to upstream yet). > > -Daniel > > > > > > Which is possible because of how it's internally implemented (I think > > > > load balancer relies on that), but not really a synchronization > > > > > > Virtual engine is a single timeline by definition and it is still that > > > regardless of the implementation details (execlists or GuC, in both > > > cases it is a single hardware context and a single timeline). > > > > > > > primitive we want to export as such to userspace. We have other > > > > interfaces and concepts for that. > > > > > > Yes, that is the only point to argue IMO. We can say it wasn't needed > > > and should have been avoided, but I still maintain we can't really say > > > we leaked anything backend specific to userspace via it. > > > > > > Regards, > > > > > > Tvrtko > > > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx