On Tue, Mar 23, 2021 at 08:32:50PM +0300, Dmitry Osipenko wrote:
> 23.03.2021 19:44, Thierry Reding wrote:
> > On Tue, Mar 23, 2021 at 05:00:30PM +0300, Dmitry Osipenko wrote:
> >> 23.03.2021 15:30, Thierry Reding wrote:
> >>> On Thu, Jan 14, 2021 at 12:34:22PM +0200, Mikko Perttunen wrote:
> >>>> On 1/14/21 10:36 AM, Dmitry Osipenko wrote:
> >>>>> 13.01.2021 21:56, Mikko Perttunen wrote:
> >>>>>> On 1/13/21 8:14 PM, Dmitry Osipenko wrote:
> >>>>>>> 11.01.2021 16:00, Mikko Perttunen wrote:
> >>>>>>>> +struct drm_tegra_submit_buf {
> >>>>>>>> +	/**
> >>>>>>>> +	 * @mapping_id: [in]
> >>>>>>>> +	 *
> >>>>>>>> +	 * Identifier of the mapping to use in the submission.
> >>>>>>>> +	 */
> >>>>>>>> +	__u32 mapping_id;
> >>>>>>>
> >>>>>>> I'm now in the process of trying out the UAPI using the grate
> >>>>>>> drivers, and this becomes the first obstacle.
> >>>>>>>
> >>>>>>> It looks like this is not going to work well for older Tegra SoCs,
> >>>>>>> in particular for T20, which has a small GART.
> >>>>>>>
> >>>>>>> Given that the usefulness of the partial mapping feature is very
> >>>>>>> questionable until it has been proven with a real userspace, we
> >>>>>>> should start with dynamic mappings that are done at the time of
> >>>>>>> job submission.
> >>>>>>>
> >>>>>>> DRM should already have everything necessary for creating and
> >>>>>>> managing caches of mappings; the grate kernel driver has been
> >>>>>>> using drm_mm_scan for that for a long time now.
> >>>>>>>
> >>>>>>> It should be fine to support the static mapping feature, but it
> >>>>>>> should be done separately with the drm_mm integration, IMO.
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>
> >>>>>> Can you elaborate on the requirements to be able to use the GART?
> >>>>>> Are there any other reasons this would not work on older chips?
> >>>>>
> >>>>> We have all DRM devices in a single address space on T30+, hence
> >>>>> having duplicated mappings for each device would be a bit wasteful.
> >>>>
> >>>> I guess this should be pretty easy to change to only keep one
> >>>> mapping per GEM object.
> >>>
> >>> The important point here is the semantics: this IOCTL establishes a
> >>> mapping for a given GEM object on a given channel. If the underlying
> >>> implementation is such that the mapping doesn't fit into the GART,
> >>> then that's an implementation detail that the driver needs to take
> >>> care of. Similarly, if multiple devices share a single address space,
> >>> that's something the driver already knows and can take advantage of
> >>> by simply reusing an existing mapping if one already exists. In both
> >>> cases the semantics would be correctly implemented, and that's really
> >>> all that matters.
> >>>
> >>> Overall this interface seems sound from a high-level point of view
> >>> and allows these mappings to be properly created even for the cases
> >>> we have where each channel may have a separate address space. It may
> >>> not be the optimal interface for all use-cases or any one individual
> >>> case, but the very nature of these interfaces is to abstract away
> >>> certain differences in order to provide a unified interface to a
> >>> common programming model. So there will always be certain tradeoffs.
> >>
> >> For now this IOCTL isn't useful from a userspace perspective on older
> >> SoCs, and I'll need to add a lot of code that won't do anything useful
> >> just to conform to the specific needs of the newer SoCs. Trying to
> >> unify everything into a single API doesn't sound like a good idea at
> >> this point. I already suggested to Mikko to try out a variant with
> >> separate per-SoC code paths in the next version; the mappings could
> >> then be handled separately by the T186+ paths.
> >
> > I'm not sure I understand what you're saying. Obviously the underlying
> > implementation of this might have to differ depending on SoC
> > generation.
> > But it sounds like you're suggesting having different UAPIs depending
> > on SoC generation.
>
> I suggested that this IOCTL shouldn't be mandatory for older SoCs, which
> we will need to support anyway in order to preserve the older UAPI. Yes,
> this makes the UAPI different, and I want to see how it will look in the
> next version, since the current variant is sub-optimal.

What exactly is sub-optimal about the current variant? And what would an
alternative look like? Like what we have in the old ABI, where we pass in
GEM handles directly during submission? I can see how this new variant
would be a bit more work than the alternative, but even on older SoCs,
wouldn't the explicit mapping be much better for performance than having
to constantly remap GEM objects for every job?

In general I don't think it's useful to have separate UAPIs for what's
basically the same hardware. I mean, from a high-level point of view,
what we need to do for each job remains exactly the same whether the job
is executed on Tegra20 or Tegra194: we have to map a bunch of buffers so
that they can be accessed by the hardware, and then we have a command
stream that references the mappings and does something to the memory that
they represent. The only thing that differs between SoC generations is
how these mappings are created.

The difference between the old UABI and this one is that we create the
mappings upfront, and I'm not sure I understand how that could be
sub-optimal. If anything, it should increase the efficiency of job
submission by reducing per-submit overhead. It should be fairly easy to
compare this in terms of performance with implicit mappings by running
tests against the old UABI and the new one. If there's a significant
impact, we may need to take a closer look.

Yes, this will require a bit of work in userspace to adapt to these
changes, but that's a one-time cost, so I don't think it's wise to ignore
potential performance improvements just because we don't want to update
userspace.
In either case, I don't think we're quite done yet. There's still a bit
of work left to do on the userspace side to bring a couple of use-cases
up with this new UABI, and it's not entirely clear yet what the results
will be. However, we have to move forward somehow, or this will end up
being yet another attempt that didn't go anywhere. We were in a similar
place a few years back, and I vividly remember how frustrating it was for
me personally to spend all of that time working through this stuff and
then see it all go to waste.

So can we please keep going for a little longer while there's still
momentum?

Thierry
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel