Re: [PATCH v3] drm/fourcc: document modifier uniqueness requirements

Alex Deucher <alexdeucher@xxxxxxxxx> · Mon, 1 Jun 2020 10:25:24 -0400

On Fri, May 29, 2020 at 11:03 AM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote:
>
> On Fri, 29 May 2020 at 15:36, Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> > On Fri, May 29, 2020 at 10:32 AM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote:
> > > On Fri, 29 May 2020 at 15:29, Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> > > > Maybe I'm overthinking this.  I just don't want to get into a
> > > > situation where we go through a lot of effort to add modifier support
> > > > and then performance ends up being worse than it is today in a lot of
> > > > cases.
> > >
> > > I'm genuinely curious: what do you imagine could cause a worse result?
> >
> > As an example, in some cases, it's actually better to use linear for
> > system memory because it better aligns with pcie access patterns than
> > some tiling formats (which are better aligned for the memory
> > controller topology on the dGPU).  That said, I haven't been in the
> > loop as much with the tiling formats on newer GPUs, so that may not be
> > as much of an issue anymore.
>
> Yeah, that makes a lot of sense. On the other hand, placement isn't
> explicitly encoded for either modifiers or non-modifiers, so I'm not
> sure how it would really regress.
>
> In case it was missed somewhere, there is no generic code doing
> modifier selection for modifier optimality anywhere. The flow is:
>   - every producer/consumer advertises a list of modifier + format
> pairs, declaring what they _can_ support
>   - for every use where a buffer needs to be allocated, the generic
> code intersects these lists of modifiers to determine the set of
> modifiers mutually acceptable to all consumers
>   - the buffer allocator is always handed a _list_ of modifiers, and
> makes its own decision based on ??
>
> For a concrete end-to-end example:
>   - KMS declares which modifiers are supported for scanout
>   - EGL declares which modifiers are supported for EGLImage import
>   - Weston determines that one of its clients could be directly
> scanned out rather than composited
>   - Weston intersects the KMS + EGL set of modifiers to come up with
> the optimal modifier set (i.e. bypassing composition)
>   - Weston sends this intersected list to the client via the Wayland
> protocol (mentioned in previous MR)
>   - the client is using EGL, so Mesa receives this list of modifiers,
> and passes this on to amdgpu
>   - amdgpu uses magic inscrutable heuristics to determine the most
> optimal modifier to use, and allocates a buffer based on that
>
> Weston (or GNOME Shell, or Chromium, or whatever) will never be in a
> position as a generic client to know that on Raven2 it should use a
> particular supertiled layout with no DCC if width > 2048. So we
> designed the entire framework to explicitly avoid generic code trying
> to reason about the performance properties of specific modifiers.
>
> What Weston _does_ know, however, is that the display controller can work
> with modifier set A, and the GPU can work with modifier set B, and if
> the client can pick something from modifier set A, then there is a
> much greater probability that Weston can leave the GPU alone so it can
> be entirely used by the client. It also knows that if the surface
> can't be directly scanned out for whatever reason, then there's no
> point in the client optimising for direct scanout, and it can tell the
> client to select based on optimality purely for the GPU.
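
The intersection step described above could be sketched roughly as
follows. This is illustration only: the function name, the plain
uint64_t arrays and the example modifier values are assumptions made
here, not Weston, Mesa or KMS code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a real Weston/Mesa/KMS API): copy into out[]
 * every modifier that appears in both a[] and b[], preserving the order
 * of a[]. Returns the number of entries written, at most out_cap.
 */
static size_t intersect_modifiers(const uint64_t *a, size_t na,
                                  const uint64_t *b, size_t nb,
                                  uint64_t *out, size_t out_cap)
{
    size_t n = 0;

    assert(out != NULL || out_cap == 0);

    for (size_t i = 0; i < na && n < out_cap; i++) {
        for (size_t j = 0; j < nb; j++) {
            if (a[i] == b[j]) {
                out[n++] = a[i];
                break; /* a[i] is mutually acceptable; move on */
            }
        }
    }
    return n;
}
```

A compositor would call something like this with the scanout list as
a[] and the EGLImage import list as b[], then hand the result to the
client. The allocator on the client side still chooses which entry of
the resulting list to use.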

Just so I understand this correctly, the main reason for this is to
deal with display hardware and render hardware from different vendors
which may or may not support any common formats other than linear.  It
provides a way to tunnel device capabilities between the different
drivers.  In the case of a device with display and rendering on the
same device or multiple devices from the same vendor, it's not really
that useful.  It doesn't seem to provide much over the current EGL
hints (SCANOUT, SECURE, etc.).  I still don't understand how it solves
the DCC problem though.  Compression and encryption seem kind of like
meta modifiers.  There is an underlying high-level layout, linear,
tiled, etc. but it could also be compressed and/or encrypted.  Is the
idea that those are separate modifiers?  E.g.,
0: linear
1: linear | encrypted
2: linear | compressed
3: linear | encrypted | compressed
4: tiled1
5: tiled1 | encrypted
6: tiled1 | compressed
7: tiled1 | encrypted | compressed
etc.
Or that the modifiers only expose the high-level layout, and it's then
up to the driver(s) to enable compression, etc. if both sides have a
compatible layout?
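
To make the two options concrete, here is a rough sketch in C. The
encoding is hypothetical and chosen only to reproduce the numbering in
the list above (bit 0 = encrypted, bit 1 = compressed, remaining bits =
layout); it is not how drm_fourcc.h defines modifiers, and all names
are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, NOT the drm_fourcc.h definitions. */
#define MOD_FLAG_ENCRYPTED  (UINT64_C(1) << 0)
#define MOD_FLAG_COMPRESSED (UINT64_C(1) << 1)
#define MOD_FLAG_MASK       (MOD_FLAG_ENCRYPTED | MOD_FLAG_COMPRESSED)
#define MOD_LAYOUT_SHIFT    2

enum mod_layout {
    MOD_LAYOUT_LINEAR = 0,
    MOD_LAYOUT_TILED1 = 1,
};

/* Option 1: every layout/flag combination is its own modifier value. */
static uint64_t make_modifier(enum mod_layout layout, uint64_t flags)
{
    return ((uint64_t)layout << MOD_LAYOUT_SHIFT) | (flags & MOD_FLAG_MASK);
}

/*
 * Option 2: only the layout is negotiated. Dropping the flag bits gives
 * the high-level layout; compression and encryption would then be left
 * to the driver(s).
 */
static enum mod_layout modifier_layout(uint64_t mod)
{
    return (enum mod_layout)(mod >> MOD_LAYOUT_SHIFT);
}

/* Two modifiers are layout-compatible if they differ only in flags. */
static int same_layout(uint64_t a, uint64_t b)
{
    assert(MOD_LAYOUT_SHIFT == 2); /* flags occupy exactly bits 0-1 */
    return modifier_layout(a) == modifier_layout(b);
}
```

With this encoding, make_modifier(MOD_LAYOUT_TILED1,
MOD_FLAG_ENCRYPTED | MOD_FLAG_COMPRESSED) yields 7, matching the last
entry in the list, and same_layout(4, 6) is true because both are
tiled1.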

Thanks,

Alex

>
> So that's the thinking behind the interface: that the driver still has
> exactly as much control and ability to use magic heuristics as it
> always has, but that system components can supplement the driver's
> heuristics with their own knowledge, to increase the chance that the
> driver's heuristics arrive at a configuration that a) will definitely
> work, and b) have a much greater chance of working optimally.
>
> Does that help at all?
>
> Cheers,
> Daniel