On 2021-10-14 15:44, Shankar, Uma wrote: > > >> -----Original Message----- >> From: Pekka Paalanen <ppaalanen@xxxxxxxxx> >> Sent: Wednesday, October 13, 2021 2:01 PM >> To: Shankar, Uma <uma.shankar@xxxxxxxxx> >> Cc: harry.wentland@xxxxxxx; ville.syrjala@xxxxxxxxxxxxxxx; intel- >> gfx@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx; >> brian.starkey@xxxxxxx; sebastian@xxxxxxxxxxxxxxxxx; >> Shashank.Sharma@xxxxxxx >> Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline >> >> On Tue, 12 Oct 2021 20:58:27 +0000 >> "Shankar, Uma" <uma.shankar@xxxxxxxxx> wrote: >> >>>> -----Original Message----- >>>> From: Pekka Paalanen <ppaalanen@xxxxxxxxx> >>>> Sent: Tuesday, October 12, 2021 4:01 PM >>>> To: Shankar, Uma <uma.shankar@xxxxxxxxx> >>>> Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; >>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx; harry.wentland@xxxxxxx; >>>> ville.syrjala@xxxxxxxxxxxxxxx; brian.starkey@xxxxxxx; >>>> sebastian@xxxxxxxxxxxxxxxxx; Shashank.Sharma@xxxxxxx >>>> Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware >>>> Pipeline >>>> >>>> On Tue, 7 Sep 2021 03:08:43 +0530 Uma Shankar >>>> <uma.shankar@xxxxxxxxx> wrote: >>>> >>>>> This is a RFC proposal for plane color hardware blocks. >>>>> It exposes the property interface to userspace and calls out the >>>>> details or interfaces created and the intended purpose. >>>>> >>>>> Credits: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> >>>>> Signed-off-by: Uma Shankar <uma.shankar@xxxxxxxxx> >>>>> --- >>>>> Documentation/gpu/rfc/drm_color_pipeline.rst | 167 >>>>> +++++++++++++++++++ >>>>> 1 file changed, 167 insertions(+) create mode 100644 >>>>> Documentation/gpu/rfc/drm_color_pipeline.rst >>>>> >>>>> diff --git a/Documentation/gpu/rfc/drm_color_pipeline.rst >>>>> b/Documentation/gpu/rfc/drm_color_pipeline.rst >>>>> new file mode 100644 >>>>> index 000000000000..0d1ca858783b >>>>> --- /dev/null >>>>> +++ b/Documentation/gpu/rfc/drm_color_pipeline.rst >>>>> @@ -0,0 +1,167 @@ >>>>> +================================================== >>>>> +Display Color Pipeline: Proposed DRM Properties >> >> ... >> >>>>> +Proposal is to have below properties for a plane: >>>>> + >>>>> +* Plane Degamma or Pre-Curve: >>>>> + * This will be used to linearize the input framebuffer data. >>>>> + * It will apply the reverse of the color transfer function. >>>>> + * It can be a degamma curve or OETF for HDR. >>>> >>>> As you want to produce light-linear values, you use EOTF or inverse OETF. >>>> >>>> The term OETF has a built-in assumption that that happens in a camera: >>>> it takes in light and produces and electrical signal. Lately I >>>> have personally started talking about non-linear encoding of color >>>> values, since EOTF is often associated with displays if nothing >>>> else is said (taking >> in an electrical signal and producing light). >>>> >>>> So this would be decoding the color values into light-linear color >>>> values. That is what an EOTF does, yes, but I feel there is a >>>> nuanced difference. A piece of equipment implements an EOTF by >>>> turning an electrical signal into light, hence EOTF often refers >>>> to specific equipment. You could talk about content EOTF to denote >>>> content value encoding, as opposed to output or display EOTF, but >>>> that might be confusing if you look at e.g. the diagrams in >>>> BT.2100: is it the EOTF >> or is it the inverse OETF? Is the (inverse?) OOTF included? >>>> >>>> So I try to side-step those questions by talking about encoding. >>> >>> The idea here is that frame buffer presented to display plane engine >>> will be non- >> linear. >>> So output of a media decode should result in content with EOTF applied. >> >> Hi, >> >> sure, but the question is: which EOTF. There can be many different >> things called "EOTF" in a single pipeline, and then it's up to the >> document writer to make the difference between them. Comparing two >> documents with different conventions causes a lot of confusion in my >> personal experience, so it is good to define the concepts more carefully. > > Hi Pekka, > I think you have given some nice inputs to really deep dive and document here. > Will update this with the right expectations wrt what transfer functions to be used at various stages in the operation pipeline. > >>> So output of a media decode should result in content with EOTF applied. >> >> I suspect you have it backwards. Media decode produces electrical >> (non-linear) pixel color values. If EOTF was applied, they would be >> linear instead (and require more memory to achieve the same visual precision). >> >> If you want to put it this way, you could say "with inverse EOTF >> applied", but that might be slightly confusing because it is already >> baked in to the video, it's not something a media decoder has to >> specifically apply, I think. However, the (inverse) EOTF in this case is the content EOTF, not the display EOTF. >> >> If content and display EOTF differ, then one must apply first content >> EOTF and then inverse display EOTF to get values that are correctly >> encoded for the display. (This is necessary but not sufficient in >> general.) Mind, that this is not an OOTF nor an artistic adjustment, >> this is purely a value encoding conversion. >> > 0 > > Got it, thanks for correcting. Will also do some more study around this and update. > >>> Playback transfer function (EOTF): inverse OETF plus rendering intent gamma. >> >> Does "rendering intent gamma" refer to artistic adjustments, not OOTF? >> ] > > This ideally should be OOTF adjustments here. > >> cf. BT.2100 Annex 1, "The relationship between the OETF, the EOTF and >> the OOTF", although I find those diagrams somewhat confusing still. It >> does not seem to clearly account for transmission non-linear encoding being different from the display EOTF. >> >> Different documents use OOTF to refer to different things. Then there >> is also the fundamental difference between PQ and HLG systems, where >> OOTF is by definition in different places of the camera-transmission-display pipeline. > > Agree this is a bit confusing, I admit I was much more confused than what you were for sure. > Will do some research to get my understanding in place. Thanks for all the inputs. > >>> >>> To make it linear, we should apply the OETF. Confusion is whether >>> OETF is equivalent to inverse EOTF, we could check this one out to confirm. >> >> OETF does not make anything linear. By definition, OETF converts >> optical (linear) values into electrical (practically always non-linearly encoded) values. >> >> Yes, EOTF is almost always not equal to inverse OETF because of the >> existence of OOTF. The applies to inverse EOTF vs. OETF as well. >> >> From what I have learned, it is unexpected to assume that inverse of >> one is the other. That will cause confusion. Therefore, if you mean >> the inverse, spell it out as inverse. I used to make this same >> mistake, and some of it may still be left in https://gitlab.freedesktop.org/pq/color-and-hdr . > > Got it. > >>> >>> Here since the values for the pre-curve (or degamma as we have >>> called it in past), accepts programmable lut values which makes it >>> flexible enough to >> accommodate any values. >>> This will hold for HDR as well as traditional gamma encoded SRGB data as well. >> >> Yes. >> >>>>> + * This linear data can be further acted on by the following >>>>> + * color hardware blocks in the display hardware pipeline >>>> >>>> I think this and the above description ties the intended use down >>>> too much. This is one possible way to use degamma, yes, but there >>>> may be others. Particularly if CTM can be replaced with a 3D LUT, >>>> then the degamma is more likely a shaper (non-linear adjustment to >>>> 3D LUT tap >> positions). >>> >>> Yeah agree, this is just one of the usecases. Its just a lut table >>> which can be used for other purposes as well and is not just limited >>> to a >> linearization operation. I will update this. >>> >>>> I would prefer the name pre-curve to underline that this can be >>>> whatever one wants it to be, but I understand that people may be >>>> more familiar >> with the name degamma. >>> >>> I feel pre-curve should be fine but yeah it deviates from naming of >>> legacy crtc/pipe >> color properties. >>> May be we can stay with legacy naming with more documentation to >>> have its >> usecases clearly called out. >> >> That is fine by me. >> >>>>> + >>>>> +UAPI Name: PLANE_DEGAMMA_MODE >>>>> +Description: Enum property with values as blob_id's which >>>>> +advertizes >>>>> the >>>> >>>> Is enum with blob id values even a thing? >>> >>> Yeah we could have. This is a dynamic enum created with blobs. Each >>> entry contains the data structure to extract the full color >>> capabilities of the hardware. It’s a very interesting play with >>> blobs (@ville.syrjala@xxxxxxxxxxxxxxx brainchild) >> >> Yes, I think I can imagine how that works. The blobs are created >> immutable, unable to change once the plane has been advertised to >> userspace, because their IDs are used as enum values. But that is ok, >> because the blob describes capabilities that cannot change during the >> device's lifetime... or can they? What if you would want to extend the >> blob format, do you need a KMS client cap to change the format which >> would be impossible because you can't change an enum definition after it has been installed so you cannot swap the blob to a new one? >> >> This also relies on enums being identified by their string names, >> because the numerical value is not a constant but a blob ID and gets >> determined when the immutable blob is created. >> >> Did I understand correctly? > > Yes that’s right. We are not expecting these structures to change as they represent the platforms capabilities. > >> As a userspace developer, I can totally work with that. >> >>>>> + possible degamma modes and lut ranges supported by the platform. >>>>> + This allows userspace to query and get the plane degamma color >>>>> + caps and choose the appropriate degamma mode and create lut values >>>>> + accordingly. >>>> >>>> I agree that some sort of "mode" switch is necessary, and >>>> advertisement of capabilities as well. >>>> >>> >>> This enum with blob id's is an interesting way to advertise segmented lut tables. >>> >>>>> + >>>>> +UAPI Name: PLANE_DEGAMMA_LUT >>>>> +Description: Blob property which allows a userspace to provide LUT values >>>>> + to apply degamma curve using the h/w plane degamma processing >>>>> + engine, thereby making the content as linear for further color >>>>> + processing. Userspace gets the size of LUT and precision etc >>>>> + from PLANE_DEGAMA_MODE_PROPERTY >>>> >>>> So all degamma modes will always be some kind of LUT? That may be >>>> a bit restrictive, as I understand AMD may have predefined or >>>> parameterised curves that are not LUTs. So there should be room >>>> for an arbitrary structure of parameters, which can be passed in >>>> as a blob id, and the contents defined by the degamma mode. >>> >>> For Intel's hardware these are luts but yeah AMD hardware seems to >>> have these as fixed function units. We should think of a way to have >>> this option as well in the UAPI. We could extend the DEGAMMA_MODE >>> property to have all the info, and DEGAMMA_LUT_PROPERTY may not be >>> needed based on some of the attributes passed via DEGAMMA_MODE. This >>> can be >> one way to have room for both. >>> @harry.wentland@xxxxxxx thoughts ? >> >> Yeah, though I don't think you can use DEGAMMA_MODE property to pass >> parameters to fixed-function curves. The parameters need another >> property. Would be natural to use the other property for LUT data when mode it set to LUT. >> >> A trivial and made-up example of a parameterised fixed-function curve >> is pow(x, gamma), where gamma is the parameter. > > We can maybe have a parallel property as well with proper documentation if this helps with flexibility for varying hardware vendors. UAPI document will list what to use and when. > May be platform will not even list the other one which may make things less complicated to userspace. > >>>> >>>> LUT size, precision, and other details of each degamma mode would >>>> be good to expose somehow. I kind of expected those would have >>>> been exposed through the above mentioned "enum with blob id >>>> values" where each blob content structure is defined by the respective enum value. >>> >>> Yes, you are spot on here. >>> >>>>> + >>>>> +* Plane CTM >>>>> + * This is a Property to program the color transformation matrix. >>>> >>>> No mode property here? Is there any hardware with something else >>>> than a matrix at this point? >>> >>> Not that I am aware of. >>> >>>> Should we assume there will be hardware with something else, and >>>> have a CSC mode property with only a single enum value defined so far: >>>> "matrix"? Or do we say PLANE_CTM is a matrix and if you have >>>> something else in hardware, then invent a new property for it? >>> >>> I think this should be good as we have this for crtc with no one complaining. >>> There may be hardware with fixed function operation for the CSC but >>> then its not a matrix and it would be pretty hardware dependent, so >>> we can leave that from a generic UAPI. >> >> Ok. And if that eventually turns out to be a mistake, it's easy to >> invent more properties. > > I feel that is what would be required. This is just another instance of what we have at crtc level. > Would a userspace client ever want to use both CTM and 3D LUT? We could add a mode property to CTM to allow for a CTM or 3D LUT or we could add new properties for 3D LUT support. 3D LUT might come with shaper 1D LUTs that can be programmed in conjunction with the 3D LUT. 3D LUTs operating in non-linear space require less precision than 3D LUTs operating in linear space, hence the 1D LUT can be used to non-linearize the pixel data for 3D LUT operation. FWIW, AMD HW (depending on generation) can do these operations (in this order): 1) 1D LUT (fixed or PWL programmable) 2) simple multiplier (for scaling SDR content to HDR output) 3) CTM matrix 4) 1D LUT (shaper LUT to non-linearize for more effective 3D LUT transform) 5) 3D LUT 6) 1D LUT (for non-linear blending, or to linearize after 3D LUT) 7) blending 8) CTM matrix 9) 1D LUT (shaper LUT like above) 10) 3D LUT 11) 1D LUT (generally for EOTF^-1 for display EOTF) Not all blocks are available on all (current and future) HW. I sketched a few diagrams that show how these might be used by a compositor if we exposed all of these blocks and should really try to add some of them to the color-and-hdr docs repo. >>>>> + * This can be used to perform a color space conversion like >>>>> + * BT2020 to BT709 or BT601 etc. >>>>> + * This block is generally kept after the degamma unit so that >>>> >>>> Not "generally". If blocks can change places, then it becomes >>>> intractable for generic userspace to program. >>> >>> Sure, will drop this wording here. But one open will still remain >>> for userspace, as to how it gets the pipeline dynamically for a respective hardware. >>> Currently we have assumed that this would be the logical fixed order >>> of hardware units. >> >> If we cannot model the abstract KMS pipeline as a fixed order of units >> (where each unit may exist or not), we need to take a few steps back >> here and look at what do we actually want to expose. That is a much >> bigger design problem which we are currently not even considering. > > I think most of the hardware vendor platforms have this pipeline, so we can implement the properties which include all the possible hardware blocks. If certain units don't exist, the respective properties should not be exposed which will make things easier for userspace. I think the color pipeline should be modeled in a way that makes sense from a color science standpoint and in a way that makes sense for compositor implementations. Fortunately HW design generally aligns with these intentions but we should be careful to not let HW design dictate KMS interfaces. Harry > >>>>> + * linear data can be fed to it for conversion. >>>>> + >>>>> +UAPI Name: PLANE_CTM >>>>> +Description: Blob property which allows a userspace to provide >>>>> +CTM >> coefficients >>>>> + to do color space conversion or any other enhancement by doing a >>>>> + matrix multiplication using the h/w CTM processing engine >>>>> + >>>> >>>> Speaking of color space conversions, we should probably define >>>> what happens to out-of-range color values. Converting color into >>>> smaller gamut or smaller dynamic range always has the risk of >>>> ending up with out-of-range values. I suppose those get simply >>>> clipped independently on each >> color channel, right? >>> >>> We have standard matrix to convert colorspaces. This is just the >>> property of the colorspace, I guess clipping will be the only option >>> here (irrespective of hardware) >>> >>>> Such clipping can change hue, so userspace would be better avoid >>>> triggering clipping at all, but we still need to know what would >>>> happen with out- >> of-range values. >>>> >>>> We would also need to know when clipping will happen. If FP16 >>>> (half-float) FB produces out-of-range values and degamma stage is >>>> not used, will the CTM see original or clipped values? Or is that >>>> something we have to define as hardware-specific? >>>> >>>> Generic userspace will try hard to avoid triggering >>>> hardware-specific behaviour, so you can expect such behaviour to go unused. >>> >>> Here hardware should just operate on the matrix values programmed. >>> This should be the limitation of the color space conversion >>> operation and hardware may not have any role to play apart from just >>> honoring the matrix values and producing the resultant output. >> >> I'm not talking about the matrix values. I'm talking about the values >> originating from the FB, are they clipped or not, before being processed by the matrix? >> >> Userspace could want to use the matrix to bring out-of-range FB values >> into range, but that plan cannot work if the FB values get clipped already before the matrix. >> >> I referred to FP16 format explicitly because that format can express >> values outside of the usual [0.0, 1.0] range in a framebuffer. >> >> Of course, the matrix values themselves have some limits too, and >> userspace should somehow be aware of them. > > AFAIK, the values will not be clipped before being submitted to CSC blocks. > But this can be confirmed to not leave any misconception here. > >>> >>>>> +* Plane Gamma or Post-Curve >>>>> + * This can be used to perform 2 operations: >>>>> + * non-lineralize the framebuffer data. Can be used for >>>>> + * non linear blending. It can be a gamma curve or EOTF >>>>> + * for HDR. >>>>> + * Perform Tone Mapping operation. This is an operation >>>>> + * done when blending is done with HDR and SDR content. >>>> >>>> I like this wording better than the wording for pre-curve: "can", >>>> not "will". It leaves room for creative use of this processing block. >>> >>> Ok thanks, will update pre-curve section as well. >>> >>>> >>>> Tone-mapping is needed always when dynamic range differs, so also >>>> for HDR to HDR, not just SDR to/from HDR. >>> >>> Yes correct, will update. >>> >>>>> + >>>>> +UAPI Name: PLANE_GAMMA_MODE >>>>> +Description: Enum property with values as blob_id's which advertizes the >>>>> + possible gamma modes and lut ranges supported by the platform. >>>>> + This allows userspace to query and get the plane gamma color >>>>> + caps and choose the appropriate gamma mode and create lut values >>>>> + accordingly. >>>>> + >>>>> +UAPI Name: PLANE_GAMMA_LUT >>>>> +Description: Blob property which allows a userspace to provide LUT values >>>>> + to apply gamma curve or perform tone mapping using the h/w plane >>>>> + gamma processing engine, thereby making the content as linear >>>>> + for further color processing. Userspace gets the size of LUT and >>>>> + precision etc from PLANE_GAMA_MODE_PROPERTY >>>> >>>> The same comments here as with DEGAMMA. >>> >>> Noted. >>> >>>>> + >>>>> +This is part of one plane engine. Data from multiple planes >>>>> +will be then fed to pipe where it will get blended. There is a >>>>> +similar set of properties available at crtc level which acts on this blended data. >>>>> + >>>>> +Below is a sample usecase: >>>>> + >>>>> + ┌────────────┐ ┌─────────────┐ ┌─────────────┐ >>>> ┌─────────────┐ >>>>> + │FB1 │ │Degamma Block│ │ CTM Matrix │ │ Gamma Block │ >>>>> + │ ├─────►│Linearize- ├────►│ BT709 to ├────►│ SDR to HDR >> │ >>>>> + │BT709 SDR │ │BT709 inverse│ │ BT2020 │ │ Tone >>>> Mapping├────────┐ >>>>> + └────────────┘ └─────────────┘ └─────────────┘ >>>> └─────────────┘ │ >>>>> + │ >>>>> + ┌────────────┐ ┌─────────────┐ ┌─────────────┐ >>>> ┌─────────────┐ │ >>>>> + │FB2 │ │Degamma Block│ │ CTM Matrix │ │ Gamma Block │ >> │ >>>>> + │ ├─────►│Linearize- ├────►│ BT601 to ├────►│ SDR to HDR >>>> ├─────┐ │ >>>>> + │BT601 SDR │ │BT601 inverse│ │ BT2020 │ │ Tone Mapping│ │ >> │ >>>>> + └────────────┘ └─────────────┘ └─────────────┘ >>>> └─────────────┘ │ │ >>>>> + │ │ >>>>> + ┌────────────┐ ┌─────────────┐ ┌─────────────┐ >>>> ┌─────────────┐ │ │ >>>>> + │FB3 │ │Degamma Block│ │ CTM Matrix │ │ Gamma Block │ │ >> │ >>>>> + │ ├─────►│Linearize- ├────►│ NOP (Data in├────►│ NOP (Data >>>> in├───┐ │ │ >>>>> + │BT2020 HDR │ │HDR OETF │ │ BT2020) │ │ HDR) │ │ │ │ >>>>> + └────────────┘ └─────────────┘ └─────────────┘ >>>> └─────────────┘ │ │ │ >>>> >>>> EOTF, not OETF, since it is converting E to O, electrical to optical. >>> >>> I think media decode would have given a EOTF applied data to be >>> directly consumed by display sink (in case we have chosen a display pass through). >> Not sure here though, it’s a bit confusing. >> >> I hope I managed to explain this earlier in this email. >> >> Display sinks "always" use non-linearly encoded values, because it >> saves bandwidth without losing visual quality. For the same reason, >> images tend to be stored with non-linear value encoding whenever possible. > > Yeah, thanks for the explanation and useful pointers. Will work on it to put the right expectations in the document. > > Thanks Pekka for the valuable feedback. > > Regards, > Uma Shankar >> >> Thanks, >> pq >> >>>>> + >>>>> + │ │ │ >>>>> + >>>>> + │ │ │ >>>>> + >>>>> +│ │ │ >>>>> >>>> >> +┌────────────────────────────────────────────────────────────────── >>>> ── >>>>> +───────────┴─┴──┘ >>>>> +│ >>>>> +│ ┌─────────────┐ ┌─────────────┐ ┌───────────────┐ >>>>> +│ │ CRTC Degamma│ │ CRTC CTM │ │ CRTC Gamma │ >>>>> +└─┤ Use to make ├─────►│ Use for any ├─────►│ Use for Tone >>>>> +├─────► >>>> TO Port >>>>> + │ data linear │ │ Color Space │ │ Mapping/apply │ >>>>> + │ after blend │ │ Conversion │ │ transfer func │ >>>>> + └─────────────┘ └─────────────┘ └───────────────┘ >>>> >>>> Blending does not change whether the data is linear or not. I >>>> suppose in this example, CRTC degamma and CTM would be >>>> passthrough, and gamma would be the inverse display EOTF for >>>> encoding color values into >> what the monitor expects. >>> >>> Yeah, will update this to make it clear. >>> >>>>> + >>>>> + >>>>> +This patch series adds properties for plane color features. It >>>>> +adds properties for degamma used to linearize data and CSC used >>>>> +for gamut conversion. It also includes Gamma support used to >>>>> +again non-linearize data as per panel supported color space. >>>>> +These can be utilize by user space to convert planes from one >>>>> +format to another, one color space to another etc. >>>> >>>> FWIW, this is exactly the structure I have assumed in the Weston CM&HDR work. >>> >>> This is great to hear that we are aligned wrt how the pipeline should work. >>> >>> Thanks Pekka for taking time out and providing the feedback. >>> >>> @harry.wentland@xxxxxxx We can work together and build our design to >>> accommodate both Intel and AMD's hardware needs. This will also make >>> things generic enough for any other hardware vendor as well. >>> >>> Thanks & Regards, >>> Uma Shankar >>> >>>>> + >>>>> +Userspace can take smart blending decisions and utilize these >>>>> +hardware supported plane color features to get accurate color >>>>> +profile. The same can help in consistent color quality from >>>>> +source to panel taking advantage of advanced color features in hardware. >>>>> + >>>>> +These patches add the property interfaces and enable helper functions. >>>>> +This series adds Intel's XE_LPD hw specific plane gamma feature. >>>>> +We can build up and add other platform/hardware specific >>>>> +implementation on top of this series. >>>>> + >>>>> +Credits: Special mention and credits to Ville Syrjala for >>>>> +coming up with a design for this feature and inputs. This >>>>> +series is based on his original design and idea. >>>> >>>> >>>> Thanks, >>>> pq >