On 2021-11-05 07:49, Ville Syrjälä wrote: > On Thu, Nov 04, 2021 at 12:27:56PM -0400, Harry Wentland wrote: >> >> >> On 2021-11-04 04:38, Pekka Paalanen wrote: >>> On Wed, 3 Nov 2021 11:08:13 -0400 >>> Harry Wentland <harry.wentland@xxxxxxx> wrote: >>> >>>> On 2021-09-06 17:38, Uma Shankar wrote: >>>>> Existing LUT precision structure is having only 16 bit >>>>> precision. This is not enough for upcoming enhanced hardwares >>>>> and advance usecases like HDR processing. Hence added a new >>>>> structure with 32 bit precision values. >>>>> >>>>> This also defines a new structure to define color lut ranges, >>>>> along with related macro definitions and enums. This will help >>>>> describe multi segmented lut ranges in the hardware. >>>>> >>>>> Signed-off-by: Uma Shankar <uma.shankar@xxxxxxxxx> >>>>> --- >>>>> include/uapi/drm/drm_mode.h | 58 +++++++++++++++++++++++++++++++++++++ >>>>> 1 file changed, 58 insertions(+) >>>>> >>>>> diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h >>>>> index 90c55383f1ee..1079794c86c3 100644 >>>>> --- a/include/uapi/drm/drm_mode.h >>>>> +++ b/include/uapi/drm/drm_mode.h >>>>> @@ -903,6 +903,64 @@ struct hdr_output_metadata { >>>>> }; >>>>> }; >>>>> >>>>> +/* >>>>> + * DRM_MODE_LUT_GAMMA|DRM_MODE_LUT_DEGAMMA is legal and means the LUT >>>>> + * can be used for either purpose, but not simultaneously. To expose >>>>> + * modes that support gamma and degamma simultaneously the gamma mode >>>>> + * must declare distinct DRM_MODE_LUT_GAMMA and DRM_MODE_LUT_DEGAMMA >>>>> + * ranges. >>>>> + */ >>>>> +/* LUT is for gamma (after CTM) */ >>>>> +#define DRM_MODE_LUT_GAMMA BIT(0) >>>>> +/* LUT is for degamma (before CTM) */ >>>>> +#define DRM_MODE_LUT_DEGAMMA BIT(1) >>>>> +/* linearly interpolate between the points */ >>>>> +#define DRM_MODE_LUT_INTERPOLATE BIT(2) >>>>> +/* >>>>> + * the last value of the previous range is the >>>>> + * first value of the current range. >>>>> + */ >>>>> +#define DRM_MODE_LUT_REUSE_LAST BIT(3) >>>>> +/* the curve must be non-decreasing */ >>>>> +#define DRM_MODE_LUT_NON_DECREASING BIT(4) >>>>> +/* the curve is reflected across origin for negative inputs */ >>>>> +#define DRM_MODE_LUT_REFLECT_NEGATIVE BIT(5) >>>>> +/* the same curve (red) is used for blue and green channels as well */ >>>>> +#define DRM_MODE_LUT_SINGLE_CHANNEL BIT(6) >>>>> + >>>>> +struct drm_color_lut_range { >>>>> + /* DRM_MODE_LUT_* */ >>>>> + __u32 flags; >>>>> + /* number of points on the curve */ >>>>> + __u16 count; >>>>> + /* input/output bits per component */ >>>>> + __u8 input_bpc, output_bpc; >>>>> + /* input start/end values */ >>>>> + __s32 start, end; >>>>> + /* output min/max values */ >>>>> + __s32 min, max; >>>>> +}; >>>>> + >>>>> +enum lut_type { >>>>> + LUT_TYPE_DEGAMMA = 0, >>>>> + LUT_TYPE_GAMMA = 1, >>>>> +}; >>>>> + >>>>> +/* >>>>> + * Creating 64 bit palette entries for better data >>>>> + * precision. This will be required for HDR and >>>>> + * similar color processing usecases. >>>>> + */ >>>>> +struct drm_color_lut_ext { >>>>> + /* >>>>> + * Data is U32.32 fixed point format. >>>>> + */ >>>>> + __u64 red; >>>>> + __u64 green; >>>>> + __u64 blue; >>>>> + __u64 reserved; >>>>> +}; >>>> >>>> I've been drawing out examples of drm_color_lut_range defined PWLs >>>> and understand a bit better what you and Ville are trying to accomplish >>>> with it. It actually makes a lot of sense and would allow for a generic >>>> way to populate different PWL definitions with a generic function. >>>> >>>> But I still have some key questions that either are not answered in these >>>> patch sets or that I somehow overlooked. >>>> >>>> Can you explain how the U32.32 fixed point format relates to the input_bpc >>>> and output_bpc in drm_color_lut_range, as we as to the pixel coming in from >>>> the framebuffer. >>>> >>>> E.g. if we have ARGB2101010 what happens to a 0x3ff red value (assuming alpha >>>> is non-multiplied)? >>>> >>>> If the drm_color_lut_range segments are defined with input_bpc of 24 bpc will >>>> 0x3ff be zero-expanded to 24-bit? Is the 24 bpc an integer? I.e. would our 3xff >>>> be interpreted as 0x3ff << (24-10)? >>>> >>>> Assuming the output_bpc is 16 bpc and the programmed LUT makes this 1-to-1 would >>>> the output value be 0x3ff << (16-10)? >>>> >>>> On AMD HW the pipe-internal format is a custom floating point format. We could >>>> probably express that in terms of input/output_bpc and do the translation in our >>>> driver between that and the internal floating point format, depending on the >>>> framebuffer format, but there is the added complication of the magnitude of the >>>> pixel data and correlating HDR with SDR planes. >>>> >>>> E.g. any SDR data would map from 0.0 to 1.0 floating point, while HDR content would >>>> map from 0.0 to some value larger than 1.0. I don't (yet) have a clear picture how >>>> to represent that with the drm_color_lut_range definition. >>> >>> >>> Hi Harry, >>> >>> I think you just would not. Conceptually an SDR plane gets its very own >>> LUT that converts the FB [0.0, 1.0] range to any appropriate [a >= 0.0, >>> b <= 1.0] range in HDR. This is purely conceptual, in the terms of the >>> abstract KMS color pipeline, where [0.0, 1.0] is always the full >>> dynamic range at any point of the pipeline, meaning it is relative to >>> its placement in the pipeline. If you want to use values >1.0 in hw, >>> you can do so under the hood. >>> >>> At least that is how I would imagine things. With LUTs in general, I >>> don't think I have ever seen LUT input domain being explicitly defined >>> to something else than [0.0, 1.0] relative to the elements in the LUT >>> where 0.0 maps exactly to the first element and 1.0 maps exactly to the >>> last element. >>> >>> I'm of course open to other suggestions, but having values outside of >>> [0.0, 1.0] range in the abstract pipeline will always raise the >>> question: how do you feed those to the LUT next in the pipeline. >>> >> >> AMD HW defines the LUT addressing in floating point space and allows >> for addressing beyond 1.0. In fact on other OSes our driver uses >> [0.0, 1.0] for SDR LUTs and [0.0, 128.0] for HDR LUTs. >> >> There are color spaces that extend beyond 1.0 and even into the negative >> range: https://en.wikipedia.org/wiki/ScRGB>>> >> I don't think we should define the LUT to be limited to [0.0, 1.0]. > > That is not the intention, or at least wasn't my intention when > originally I proposed this gamma mode stuff. I specifically wanted > support for extended values. So 0.0-1.0 is supposed to be just the > range for the in gamut values. > Make sense and good to hear. >> >> If the framebuffer is not in FP16 the question then becomes how >> the integer pixel values relate to LUT addressing. > > The idea again is that 0.0-1.0 is the range for the in gamut > values. IIRC our hw does have the possibility of scaling all > the fp16 input values by some programmable constant factor, > but exposing that would require yet another uapi addition. > >> >> As well, LUT entries are defined to be U32.32 fixed point, also >> allowing for entries much greater than 1.0. If those are programmed >> as entries in the input (degamma) LUT how will they be used to address >> entries in the output (gamma) LUT? > > The u32.32 is a mistake I think. IMO we should be going for signed > values everywhere. Though our hw does not actually let us directly > program negative values for the LUT, rather the hw just reflects > the whole curve across the origin so the lookup is basically just > something like: > output = input < 0 ? -lut[abs(input)] : lut[input]; > > That is why there is that proposed DRM_MODE_LUT_REFLECT_NEGATIVE flag. > That's pretty much how I expect AMD HW to operate as well. Harry > I think nouveau had something funny in its lut programming that > made me think that it might actually have straight up programmable > LUT entries for negative inputs as well. But I'm not sure if anyone > has actually tested that. > >> >> Maybe we want to say the actual input values are being used to >> address the LUT entries? But if we look at segment 1 of Uma's >> d13_degamma_hdr definition we see that the range of 0 to >> (1 << 24) -1 is covered by 128 (1 << 7) entries, so the range >> of 0 to 256 (for RGB with 8 bpc) would only be covered by 2 >> LUT entries. So this interpretation doesn't make sense. >> >> You can see, I'm still confused by these definitions. An IGT >> test would help explain the API intentions a bit better, though >> I would like to see more precise documentation. >> >> Despite my confusion I think the segmented PWL definitions are >> a neat way to concisely describe the HW capabilities to >> userspace and a concise way for userspace to provide the LUT >> more precisely than with a uniformly spaced LUT. >> >> Harry >> >>> Yeah, I have no idea what it should mean if an FB has a format that >>> allows values beyond [0.0, 1.0]. >>> >>> >>> Thanks, >>> pq >>> >>> >>>> If some of these questions should be obvious I apologize for being a bit dense, >>>> though it helps to make this accessible to the lowest common denominator >>>> to ensure not only the smartest devs can work with this. >>>> >>>> Harry >>>> >>>>> + >>>>> #define DRM_MODE_PAGE_FLIP_EVENT 0x01 >>>>> #define DRM_MODE_PAGE_FLIP_ASYNC 0x02 >>>>> #define DRM_MODE_PAGE_FLIP_TARGET_ABSOLUTE 0x4 >>>>> >>>> >>> >