02.07.2020 15:10, Mikko Perttunen wrote:
> Ok, so we would have two kinds of syncpoints for the job; one
> for kernel job tracking; and one that userspace can
> manipulate as it wants to.
>
> Could we handle the job tracking syncpoint completely inside the kernel,
> i.e. allocate it in kernel during job submission, and add an increment
> for it at the end of the job (with condition OP_DONE)? For MLOCKing, the
> kernel already needs to insert a SYNCPT_INCR(OP_DONE) + WAIT +
> MLOCK_RELEASE sequence at the end of each job.

If the sync point is allocated within the kernel, then we'll always need
to patch all of the job's sync point increments with the ID of the
allocated sync point, regardless of whether the firewall is enabled or
not.

Secondly, I'm now recalling that only one sync point could be assigned
to a channel at a time on newer Tegras which support sync point
protection.

So it sounds like we don't really have options other than to allocate
one sync point per channel for the jobs' usage if we want to be able to
push multiple jobs into the channel's pushbuffer, correct?

...
>> Hmm, we actually should be able to have a one sync point per-channel for
>> the job submission, similarly to what the current driver does!
>>
>> I'm keep forgetting about the waitbase existence!
>
> Tegra194 doesn't have waitbases, but if we are resubmitting all the jobs
> anyway, can't we just recalculate wait thresholds at that time?

Yes, the thresholds could be recalculated and the job re-formed at
push time then.

It also means that jobs should always be formed only at push time if a
wait command is used by the cmdstream, since the waits always need to be
patched: we won't know the thresholds until the scheduler actually runs
the job.

> Maybe a more detailed sequence list or diagram of what happens during
> submission and recovery would be useful.

The textual form + code is already good enough to me. A diagram could be
nice to have, although it may take a bit too much effort to create +
maintain it.
But I don't mind at all if you'd want to make one :)

...
>>> * We should be able to keep the syncpoint refcounting based on fences.
>>
>> The fence doesn't need the sync point itself, it only needs to get a
>> signal when the threshold is reached or when sync point is ceased.
>>
>> Imagine:
>>
>> - Process A creates sync point
>> - Process A creates dma-fence from this sync point
>> - Process A exports dma-fence to process B
>> - Process A dies
>>
>> What should happen to process B?
>>
>> - Should dma-fence of the process B get a error signal when process A
>>   dies?
>> - Should process B get stuck waiting endlessly for the dma-fence?
>>
>> This is one example of why I'm proposing that fence shouldn't be coupled
>> tightly to a sync point.
>
> As a baseline, we should consider process B to get stuck endlessly
> (until a timeout of its choosing) for the fence. In this case it is
> avoidable, but if the ID/threshold pair is exported out of the fence and
> is waited for otherwise, it is unavoidable. I.e. once the ID/threshold
> are exported out of a fence, the waiter can only see the fence being
> signaled by the threshold being reached, not by the syncpoint getting
> freed.

This is correct. If the sync point's FD is exported, or once the sync
point is resolved from a dma-fence, then the sync point will stay alive
until the last reference to it is dropped. I.e. if process A dies
*after* process B started to wait on the sync point, then process B will
get stuck.
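
To make the lifetime rule above a bit more concrete, here is a minimal
user-space sketch in C. It is not the host1x code and not a proposed
implementation: struct syncpt, syncpt_init(), syncpt_get(), syncpt_put(),
syncpt_is_expired() and demo_process_a_dies() are all hypothetical names
made up for this illustration, and "released" merely stands in for
returning the ID to the allocator. It only shows that a reference taken
by process B keeps the sync point alive after process A is gone, while
the threshold may never be reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified sync point; NOT the real host1x structure. */
struct syncpt {
	unsigned int id;
	unsigned int value;     /* current counter value */
	unsigned int refcount;  /* creator, exported FDs, resolved fences */
	bool released;          /* true once the ID could be reused */
};

static void syncpt_init(struct syncpt *sp, unsigned int id)
{
	sp->id = id;
	sp->value = 0;
	sp->refcount = 1;       /* reference held by the creator (process A) */
	sp->released = false;
}

static struct syncpt *syncpt_get(struct syncpt *sp)
{
	assert(!sp->released);
	sp->refcount++;
	return sp;
}

static void syncpt_put(struct syncpt *sp)
{
	assert(sp->refcount > 0);
	if (--sp->refcount == 0)
		sp->released = true;
}

/* Wrap-safe "threshold reached" check on a 32-bit counter. */
static bool syncpt_is_expired(const struct syncpt *sp, unsigned int thresh)
{
	return (int)(sp->value - thresh) >= 0;
}

/*
 * Scenario from this thread: process A creates the sync point, process B
 * resolves it from a fence and waits for `thresh`, then process A dies.
 * Returns true if B's wait could complete, false if B would be stuck.
 */
static bool demo_process_a_dies(unsigned int thresh)
{
	struct syncpt sp;
	struct syncpt *b_ref;
	bool done;

	syncpt_init(&sp, 1);
	b_ref = syncpt_get(&sp);   /* B resolves the sync point from the fence */
	syncpt_put(&sp);           /* A dies and drops its reference */

	assert(!sp.released);      /* B's reference keeps the sync point alive */
	done = syncpt_is_expired(b_ref, thresh); /* nobody increments anymore */

	syncpt_put(b_ref);         /* B gives up, e.g. after its own timeout */
	assert(sp.released);       /* last reference dropped */
	return done;
}
```

In this sketch demo_process_a_dies(1) reports that B's wait cannot
complete, since nothing increments the counter after A is gone, which is
the "stuck until a timeout of its choosing" case discussed above.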