Re: [RFC] Host1x/TegraDRM UAPI (sync points)

On 6/29/20 10:42 PM, Dmitry Osipenko wrote:

> Secondly, I suppose neither GPU, nor DLA could wait on a host1x sync
> point, correct? Or are they integrated with Host1x HW?


They can access syncpoints directly. (That's what I alluded to in the "Introduction to the hardware" section :) all those things have hardware access to syncpoints)

>
> .. rest ..
>

Let me try to summarize once more for my own understanding:

* When submitting a job, you would allocate new syncpoints for the job
* After submitting the job, those syncpoints are not usable anymore
* Postfences of that job would keep references to those syncpoints so they aren't freed and cleared before the fences have been released
* Once all postfences have been released, syncpoints would be returned to the pool and reset to zero

The advantage of this would be that at any point in time, there would be a 1:1 correspondence between allocated syncpoints and jobs; so you could shuffle the jobs around channels or reorder them.

Please correct if I got that wrong :)

---

I have two concerns:

* A lot of churn on syncpoints - any time you submit a job, you might not get a syncpoint for an indefinite time if the pool has run out. If we allocate syncpoints up-front, at least you know beforehand whether you have one, and then you keep the syncpoint as long as you need it.
* Plumbing the dma-fence/sync_file everywhere, and keeping it alive until all waits on it have completed, is more work than just having the ID/threshold pair. This is mainly a problem for downstream, where updating code for this would be difficult. I know that's not a proper argument, but I hope we can reach something that works for both worlds.

Here's a proposal in between:

* Keep syncpoint allocation and submission in jobs as in my original proposal
* Don't attempt to recover user channel contexts. What this means:
  * If we have a hardware channel per context (MLOCKing), just tear down the channel
  * Otherwise, we can remove (either by patching or by full teardown/resubmit of the channel) all jobs submitted by the user channel context that submitted the hanging job. Jobs of other contexts would be undisturbed (though potentially delayed, which could be taken into account and timeouts adjusted)
  * If this happens, we can set the removed jobs' postfences to error status and the user will have to resubmit them.
* We should be able to keep the syncpoint refcounting based on fences.

This can be made more fine-grained by not caring about the user channel context, but tearing down all jobs with the same syncpoint. I think the result would be that we can get either what you described (or how I understood it in the summary in the beginning of the message), or a more traditional syncpoint-per-userctx workflow, depending on how the userspace decides to allocate syncpoints.

If needed, the kernel can still do e.g. reordering (you mentioned job priorities) at syncpoint granularity, which, if the userspace followed the model you described, would be the same thing as job granularity.

(Maybe it would be more difficult with the current drm_scheduler; sorry, I haven't had the time yet to read up on that. I'm busy clearing up work stuff before the summer vacation.)

Mikko


