On 1/29/21 7:30 PM, Dmitry Osipenko wrote:
28.01.2021 19:58, Thierry Reding пишет:
On Thu, Jan 28, 2021 at 01:08:54PM +0200, Mikko Perttunen wrote:
On 1/27/21 11:20 PM, Dmitry Osipenko wrote:
26.01.2021 05:45, Mikko Perttunen пишет:
2. We will probably need a dedicated drm_tegra_submit_cmd for sync point
increments. The job's sync point will be allocated dynamically when job
is submitted. We will need a fag for the sync_incr and wait_syncpt
commands, saying "it's a job's sync point increment/wait"
Negative. Like I have explained in previous discussions, with the
current way the usage of hardware resources is much more deterministic
and obvious. I disagree on the point that this is much more complicated
for the userspace. Separating syncpoint and channel allocation is one of
the primary motivations of this series for me.
Sync points are a limited resource. The most sensible way to work around
it is to keep sync points within kernel as much as possible. This is not
only much simpler for user space, but also allows to utilize DRM API
properly without re-inventing what already exists and it's easier to
maintain hardware in a good state.
I've spent the last few years designing for automotive and industrial
products, where we don't want to at runtime notice that the system is out of
free syncpoints and because of that we can only process the next camera
frame in a second or two instead of 16 milliseconds. We need to know that
once we have allocated the resource, it is there. The newer chips are also
designed to support this.
Considering Linux is increasingly being used for such applications, and they
are important target markets for NVIDIA, these need to be supported.
Because of the above design constraint the userspace software that runs in
these environments also expects resources to be allocated up front. This
isn't a matter of having to design that software according to what kind of
allocation API we decide do at Linux level -- it's no use designing for
dynamic allocation if it leads to you not meeting the safety requirement of
needing to ensure you have all resources allocated up front.
This isn't a good design feature just in a car, but in anything that needs
to be reliable. However, it does pose some tradeoffs, and if you think that
running out of syncpoints on T20-T114 because of upfront allocation is an
actual problem, I'm not opposed to having both options available.
The word "reliable" contradicts to the error-prone approach. On the
other hand, it should be very difficult to change the stubborn
downstream firmware and we want driver to be usable as much as possible,
so in reality not much can be done about it.
Depends on the perspective.
I think that's a fair point. I don't see why we can't support both
implicit and explicit syncpoint requests. If most of the use-cases can
work with implicit syncpoints and let the kernel handle all aspects of
it, that's great. But there's no reason we can't provide more explicit
controls for cases where it's really important that all the resources
are allocated upfront.
Ultimately this is very specific to each use-case, so I think having
both options will provide us with the most flexibility.
It should be fine to support both. This will add complexity to the
driver, thus it needs to be done wisely.
I'll need more time to think about it.
How about something like this:
Turn the syncpt_incr field back into an array of structs like
#define DRM_TEGRA_SUBMIT_SYNCPT_INCR_REPLACE_SYNCOBJ (1<<0)
#define DRM_TEGRA_SUBMIT_SYNCPT_INCR_PATCH_DYNAMIC_SYNCPT (1<<1)
struct drm_tegra_submit_syncpt_incr {
/* can be left as zero if using dynamic syncpt */
__u32 syncpt_id;
__u32 flags;
struct {
__u32 syncobj;
__u32 value;
} fence;
/* patch word as such:
* *word = *word | (syncpt_id << shift)
*/
struct {
__u32 gather_offset_words;
__u32 shift;
} patch;
};
So this will work similarly to the buffer reloc system; the kernel
driver will allocate a job syncpoint and patch in the syncpoint ID if
requested, and allows outputting syncobjs for each increment.
Mikko