On Thu, Feb 01, 2024 at 12:03:20PM -0500, Sasha Levin wrote: > This is a note to let you know that I've just added the patch titled > > drm: Fix color LUT rounding > > to the 6.7-stable tree which can be found at: > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary > > The filename of the patch is: > drm-fix-color-lut-rounding.patch > and it can be found in the queue-6.7 subdirectory. > > If you, or anyone else, feels it should not be added to the stable tree, > please let <stable@xxxxxxxxxxxxxxx> know about it. I guess I wasn't clear enough in the other mail... NAK for all of backports of this patch. > > > > commit 09bb383f88338316b11ca2270176a3439efef22f > Author: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > Date: Fri Oct 13 16:13:59 2023 +0300 > > drm: Fix color LUT rounding > > [ Upstream commit c6fbb6bca10838485b820e8a26c23996f77ce580 ] > > The current implementation of drm_color_lut_extract() > generates weird results. Eg. if we go through all the > values for 16->8bpc conversion we see the following pattern: > > in out (count) > 0 - 7f -> 0 (128) > 80 - 17f -> 1 (256) > 180 - 27f -> 2 (256) > 280 - 37f -> 3 (256) > ... > fb80 - fc7f -> fc (256) > fc80 - fd7f -> fd (256) > fd80 - fe7f -> fe (256) > fe80 - ffff -> ff (384) > > So less values map to 0 and more values map 0xff, which > doesn't seem particularly great. > > To get just the same number of input values to map to > the same output values we'd just need to drop the rounding > entrirely. But perhaps a better idea would be to follow the > OpenGL int<->float conversion rules, in which case we get > the following results: > > in out (count) > 0 - 80 -> 0 (129) > 81 - 181 -> 1 (257) > 182 - 282 -> 2 (257) > 283 - 383 -> 3 (257) > ... > fc7c - fd7c -> fc (257) > fd7d - fe7d -> fd (257) > fe7e - ff7e -> fe (257) > ff7f - ffff -> ff (129) > > Note that since the divisor is constant the compiler > is able to optimize away the integer division in most > cases. The only exception is the _ULL() case on 32bit > architectures since that gets emitted as inline asm > via do_div() and thus the compiler doesn't get to > optimize it. > > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > Link: https://patchwork.freedesktop.org/patch/msgid/20231013131402.24072-2-ville.syrjala@xxxxxxxxxxxxxxx > Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@xxxxxxxxx> > Reviewed-by: Jani Nikula <jani.nikula@xxxxxxxxx> > Acked-by: Maxime Ripard <mripard@xxxxxxxxxx> > Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> > > diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h > index 81c298488b0c..54b2b2467bfd 100644 > --- a/include/drm/drm_color_mgmt.h > +++ b/include/drm/drm_color_mgmt.h > @@ -36,20 +36,17 @@ struct drm_plane; > * > * Extract a degamma/gamma LUT value provided by user (in the form of > * &drm_color_lut entries) and round it to the precision supported by the > - * hardware. > + * hardware, following OpenGL int<->float conversion rules > + * (see eg. OpenGL 4.6 specification - 2.3.5 Fixed-Point Data Conversions). > */ > static inline u32 drm_color_lut_extract(u32 user_input, int bit_precision) > { > - u32 val = user_input; > - u32 max = 0xffff >> (16 - bit_precision); > - > - /* Round only if we're not using full precision. */ > - if (bit_precision < 16) { > - val += 1UL << (16 - bit_precision - 1); > - val >>= 16 - bit_precision; > - } > - > - return clamp_val(val, 0, max); > + if (bit_precision > 16) > + return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(user_input, (1 << bit_precision) - 1), > + (1 << 16) - 1); > + else > + return DIV_ROUND_CLOSEST(user_input * ((1 << bit_precision) - 1), > + (1 << 16) - 1); > } > > u64 drm_color_ctm_s31_32_to_qm_n(u64 user_input, u32 m, u32 n); -- Ville Syrjälä Intel