Patch "drm: Fix color LUT rounding" has been added to the 6.6-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Thu, 1 Feb 2024 12:15:41 -0500

This is a note to let you know that I've just added the patch titled

    drm: Fix color LUT rounding

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-fix-color-lut-rounding.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 5ca610239fb933a0793ab29761ef568f79fa3cdd
Author: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
Date:   Fri Oct 13 16:13:59 2023 +0300

    drm: Fix color LUT rounding
    
    [ Upstream commit c6fbb6bca10838485b820e8a26c23996f77ce580 ]
    
    The current implementation of drm_color_lut_extract()
    generates weird results. Eg. if we go through all the
    values for 16->8bpc conversion we see the following pattern:
    
    in            out (count)
       0 -   7f ->  0 (128)
      80 -  17f ->  1 (256)
     180 -  27f ->  2 (256)
     280 -  37f ->  3 (256)
    ...
    fb80 - fc7f -> fc (256)
    fc80 - fd7f -> fd (256)
    fd80 - fe7f -> fe (256)
    fe80 - ffff -> ff (384)
    
    So less values map to 0 and more values map 0xff, which
    doesn't seem particularly great.
    
    To get just the same number of input values to map to
    the same output values we'd just need to drop the rounding
    entrirely. But perhaps a better idea would be to follow the
    OpenGL int<->float conversion rules, in which case we get
    the following results:
    
    in            out (count)
       0 -   80 ->  0 (129)
      81 -  181 ->  1 (257)
     182 -  282 ->  2 (257)
     283 -  383 ->  3 (257)
    ...
    fc7c - fd7c -> fc (257)
    fd7d - fe7d -> fd (257)
    fe7e - ff7e -> fe (257)
    ff7f - ffff -> ff (129)
    
    Note that since the divisor is constant the compiler
    is able to optimize away the integer division in most
    cases. The only exception is the _ULL() case on 32bit
    architectures since that gets emitted as inline asm
    via do_div() and thus the compiler doesn't get to
    optimize it.
    
    Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231013131402.24072-2-ville.syrjala@xxxxxxxxxxxxxxx
    Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@xxxxxxxxx>
    Reviewed-by: Jani Nikula <jani.nikula@xxxxxxxxx>
    Acked-by: Maxime Ripard <mripard@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h
index 81c298488b0c..54b2b2467bfd 100644
--- a/include/drm/drm_color_mgmt.h
+++ b/include/drm/drm_color_mgmt.h
@@ -36,20 +36,17 @@ struct drm_plane;
  *
  * Extract a degamma/gamma LUT value provided by user (in the form of
  * &drm_color_lut entries) and round it to the precision supported by the
- * hardware.
+ * hardware, following OpenGL int<->float conversion rules
+ * (see eg. OpenGL 4.6 specification - 2.3.5 Fixed-Point Data Conversions).
  */
 static inline u32 drm_color_lut_extract(u32 user_input, int bit_precision)
 {
-	u32 val = user_input;
-	u32 max = 0xffff >> (16 - bit_precision);
-
-	/* Round only if we're not using full precision. */
-	if (bit_precision < 16) {
-		val += 1UL << (16 - bit_precision - 1);
-		val >>= 16 - bit_precision;
-	}
-
-	return clamp_val(val, 0, max);
+	if (bit_precision > 16)
+		return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(user_input, (1 << bit_precision) - 1),
+					     (1 << 16) - 1);
+	else
+		return DIV_ROUND_CLOSEST(user_input * ((1 << bit_precision) - 1),
+					 (1 << 16) - 1);
 }
 
 u64 drm_color_ctm_s31_32_to_qm_n(u64 user_input, u32 m, u32 n);