Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment

James Jones <jajones@xxxxxxxxxx> · Tue, 14 Jan 2025 09:55:44 -0800

I don't see how that fits in the current modifier usage patterns. I'm 
not clear how applications are supposed to programmatically "look at the 
modifiers of other drivers to find commonalities," nor how that "keeps 
"expectations the same as today for simplicity.". I think replacing the 
existing linear modifier would be very disruptive, and I don't think 
this proposal solves a general problem. Is it common for other vendors' 
hardware to have such strict pitch/height alignment requirements? Prior 
to this discussion, I'd only ever heard of minimum alignments.

Thanks,
-James

On 1/14/25 01:38, Marek Olšák wrote:
I would keep the existing modifier interfaces, API extensions, and 
expectations the same as today for simplicity.

The new linear modifier definition (proposal) will have these fields:
    5 bits for log2 pitch alignment in bytes
    5 bits for log2 height alignment in rows
    5 bits for log2 offset alignment in bytes
    5 bits for log2 minimum pitch in bytes
    5 bits for log2 minimum (2D) image size in bytes

The pitch and the image size in bytes are no longer arbitrary values. 
They are fixed values computed from {width, height, bpp, modifier} as 
follows:
    aligned_width = align(width * bpp / 8, 1 << log2_pitch_alignment);
    aligned_height = align(height, 1 << log2_height_alignment);
    pitch = max(1 << log2_minimum_pitch, aligned_width);
    image_size = max(1 << log2_minimum_image_size, pitch * aligned_height);

The modifier defines the layout exactly and non-ambiguously. 
Overaligning the pitch or height is not supported. Only the offset 
alignment has some freedom regarding placement. Drivers can expose 
whatever they want within that definition, even exposing only 1 linear 
modifier is OK. Then, you can look at modifiers of other drivers if you 
want to find commonalities.

DRM_FORMAT_MOD_LINEAR needs to go because it prevents apps from 
detecting whether 2 devices have 0 compatible memory layouts, which is a 
useful thing to know.

Marek

On Fri, Jan 10, 2025 at 4:23 PM James Jones <jajones@xxxxxxxxxx 
<mailto:jajones@xxxxxxxxxx>> wrote:

    On 12/19/24 10:03, Simona Vetter wrote:
     > On Thu, Dec 19, 2024 at 09:02:27AM +0000, Daniel Stone wrote:
     >> On Wed, 18 Dec 2024 at 10:32, Brian Starkey
    <brian.starkey@xxxxxxx <mailto:brian.starkey@xxxxxxx>> wrote:
     >>> On Wed, Dec 18, 2024 at 11:24:58AM +0000, Simona Vetter wrote:
     >>>> For that reason I think linear modifiers with explicit pitch/size
     >>>> alignment constraints is a sound concept and fits into how
    modifiers work
     >>>> overall.
     >>>
     >>> Could we make it (more) clear that pitch alignment is a "special"
     >>> constraint (in that it's really a description of the buffer
    layout),
     >>> and that constraints in-general shouldn't be exposed via modifiers?
     >>
     >> It's still worryingly common to see requirements for contiguous
     >> allocation, if for no other reason than we'll all be stuck with
     >> Freescale/NXP i.MX6 for a long time to come. Would that be in scope
     >> for expressing constraints via modifiers as well, and if so,
    should we
     >> be trying to use feature bits to express this?
     >>
     >> How this would be used in practice is also way too
    underdocumented. We
     >> need to document that exact-round-up 64b is more restrictive than
     >> any-multiple-of 64b is more restrictive than 'classic' linear.
    We need
     >> to document what people should advertise - if we were starting from
     >> scratch, the clear answer would be that anything which doesn't care
     >> should advertise all three, anything advertising any-multiple-of
     >> should also advertise exact-round-up, etc.
     >>
     >> But we're not starting from scratch, and since linear is 'special',
     >> userspace already has explicit knowledge of it. So AMD is going to
     >> have to advertise LINEAR forever, because media frameworks know
    about
     >> DRM_FORMAT_MOD_LINEAR and pass that around explicitly when they know
     >> that the buffer is linear. That and not breaking older userspace
     >> running in containers or as part of a bisect or whatever.
     >>
     >> There's also the question of what e.g. gbm_bo_get_modifier() should
     >> return. Again, if we were starting from scratch, most restrictive
     >> would make sense. But we're not, so I think it has to return LINEAR
     >> for maximum compatibility (because modifiers can't be morphed into
     >> other ones for fun), which further cements that we're not removing
     >> LINEAR.
     >>
     >> And how should allocators determine what to go for? Given that, I
     >> think the only sensible semantics are, when only LINEAR has been
     >> passed, to pick the most restrictive set possible; when LINEAR
     >> variants have been passed as well as LINEAR, to act as if LINEAR
    were
     >> not passed at all.
     >
     > Yeah I think this makes sense, and we'd need to add that to the
    kerneldoc
     > about how drivers/apps/frameworks need to work with variants of
    LINEAR.
     >
     > Just deprecating LINEAR does indeed not work. The same way it was
    really
     > hard slow crawl (and we're still not there everywhere, if you include
     > stuff like bare metal Xorg) trying to retire the implied
    modifier. Maybe,
     > in an extremely bright future were all relevant drivers advertise
    a full
     > set of LINEAR variants, and all frameworks understand them, we'll get
     > there. But if AMD is the one special case that really needs this
    I don't
     > think it's realistic to plan for that, and what Daniel describe above
     > looks like the future we're stuck to.
     > -Sima

    I spent some time thinking about this over the break, because on a venn
    diagram it does overlap a sliver of the work we've done to define the
    differences between the concepts of constraints Vs. capabilities in the
    smorgasbord of unified memory allocator talks/workshops/prototypes/etc.
    over the years. I'm not that worried about some overlap being
    introduced, because every reasonable rule deserves an exception here
    and
    there, but I have concerns similar to Daniel's and Brian's.

    Once you start adding more than one special modifier, some things in
    the
    existing usage start to break down. Right now you can naively pass
    around modifiers, then somewhere either before or just after allocation
    depending on your usage, check if LINEAR is available and take your
    special "I can parse this thing" path, for whatever that means in your
    special use case. Modifying all those paths to include one variant of
    linear is probably OK-but-not-great. Modifying all those paths to
    include <N> variants of linear is probably unrealistic, and I do worry
    about slippery slopes here.

    ---

    What got me more interested though was this led to another thought. At
    first I didn't notice that this was an exact-match constraint and
    thought it meant the usual alignment constraint of >=, and I was
    concerned about how future variants would interact poorly. It could
    still be a concern if things progress down this route, and we have
    vendor A requiring >= 32B alignment and vendor B requiring == 64B
    alignment. They're compatible, but modifiers expressing this would
    naively cancel each-other out unless vendor A proactively advertised ==
    64B linear modifiers too. This isn't a huge deal at that scale, but it
    could get worse, and it got me thinking about a way to solve the
    problem
    of a less naive way to merge modifier lists.

    As a background, the two hard problems left with implementing a
    constraint system to sit alongside the format modifier system are:

    1) A way to name special heaps (E.g., local vidmem on device A) in the
    constraints in a way that spans process boundaries using some sort of
    identifier. There are various ways to solve this. Lately the
    thinking is
    something around dma heaps, but no one's fleshed it out yet that I'm
    aware.

    2) A transport that doesn't require us to revise every userspace API,
    kernel API, and protocol that got revised to support DRM format
    modifiers, and every API/protocol introduced since.

    I haven't seen any great ideas for the latter problem yet, but what if
    we did this:

    - Introduced a new DRM format modifier vendor that was actually
    vendor-agnostic, but implied the format modifier was a constraint
    definition fragment instead of an actual modifier.

    - Constraint-aware code could tack on its constraints (The ones it
    requires and/or the ones it can support allocating) as a series of
    additional modifiers using this vendor code. A given constraint
    might be
    fragmented into multiple modifiers, but their definition and
    serialization/deserialization mechanism could be defined in
    drm_fourcc.h
    as macros all the clients could use.

    - Existing non-constraint-aware code in a modifier usage chain might
    filter these modifiers out using the existing strict intersection
    logic.
    Hence, any link in the chain not aware of constraints would likely
    block
    their use, but that's OK. We're muddling along without them now. It
    wouldn't make those situations any worse.

    - New code would be required to use some minimal library (Header-only
    perhaps, as Simon and I proposed a few years ago) to intersect format
    modifier lists instead, and this code would parse out the constraint
    modifiers from each input list and use the envisioned per-constraint
    logic to merge them. It would result in yet another merged
    modifier+constraint list encoded as a list of modifiers that could be
    passed along through any format-modifier-aware API.

    - One consideration that would be sort of tricky is that constraints
    are
    supposed to be advertised per-modifier, so you'd have to have a way to
    associate constraint modifiers in a given set with real modifiers in
    that set or in general. This is easily solved though. Some bits of the
    constraint modifiers would already need to be used to associate and
    order constraint fragments during deserialization, since modifier lists
    aren't strictly ordered.

    This effectively allows you to use format modifiers to encode
    arbitrarily complex constraint mechanisms by piggybacking on the
    existing format modifier definition and transport mechanisms without
    breaking backwards compatibility. It's a little dirty, because
    modifiers
    are being abused to implement a raw bitstream, but modifiers and
    constraints are related concepts, so it's not a complete hack. It still
    requires modifying all the implementations in the system to fully make
    use of constraints, but doesn't require e.g. revising X11 DRI3 protocol
    again to tunnel them through Xwayland, and in situations where the
    constraint-aware thing sits downstream of the non-constraint-aware
    thing
    in the allocation pipeline, you could get some benefit even when all
    the
    upstream things aren't updated yet, because it could still merge in its
    local constraints before allocating or passing the modifier list down
    the chain.

    Does this seem like something worth pursuing to others? I've been
    trying
    to decide how to best move the allocation constraints efforts forward,
    so it's potentially something I could put some time into this year.

    Thanks,
    -James