On Thu, Nov 07, 2019 at 03:31:28PM +0000, Kazlauskas, Nicholas wrote:
> On 2019-11-07 10:17 a.m., Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> >
> > This thing can get called several thousand times per LUT
> > so seems like we want to inline it to:
> > - avoid the function call overhead
> > - allow constant folding
> >
> > A quick synthetic test (w/o any hardware interaction) with
> > a ridiculously large LUT size shows about 50% reduction in
> > runtime on my HSW and BSW boxes. Slightly less with more
> > reasonable LUT size but still easily measurable in tens
> > of microseconds.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>
> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx>
>
> Seems reasonable to me. It would probably make sense to even split this
> further into two functions, one for high precision and one for low
> precision so it's purely a calculation and not hitting any branches.

Constant folding gets rid of it.

>
> Nicholas Kazlauskas
>
> > ---
> >  drivers/gpu/drm/drm_color_mgmt.c | 24 ------------------------
> >  include/drm/drm_color_mgmt.h     | 23 ++++++++++++++++++++++-
> >  2 files changed, 22 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
> > index 4ce5c6d8de99..19c5f635992a 100644
> > --- a/drivers/gpu/drm/drm_color_mgmt.c
> > +++ b/drivers/gpu/drm/drm_color_mgmt.c
> > @@ -108,30 +108,6 @@
> >    * standard enum values supported by the DRM plane.
> >    */
> >
> > -/**
> > - * drm_color_lut_extract - clamp and round LUT entries
> > - * @user_input: input value
> > - * @bit_precision: number of bits the hw LUT supports
> > - *
> > - * Extract a degamma/gamma LUT value provided by user (in the form of
> > - * &drm_color_lut entries) and round it to the precision supported by the
> > - * hardware.
> > - */
> > -uint32_t drm_color_lut_extract(uint32_t user_input, uint32_t bit_precision)
> > -{
> > -	uint32_t val = user_input;
> > -	uint32_t max = 0xffff >> (16 - bit_precision);
> > -
> > -	/* Round only if we're not using full precision. */
> > -	if (bit_precision < 16) {
> > -		val += 1UL << (16 - bit_precision - 1);
> > -		val >>= 16 - bit_precision;
> > -	}
> > -
> > -	return clamp_val(val, 0, max);
> > -}
> > -EXPORT_SYMBOL(drm_color_lut_extract);
> > -
> >   /**
> >    * drm_crtc_enable_color_mgmt - enable color management properties
> >    * @crtc: DRM CRTC
> > diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h
> > index d1c662d92ab7..069b21d61871 100644
> > --- a/include/drm/drm_color_mgmt.h
> > +++ b/include/drm/drm_color_mgmt.h
> > @@ -29,7 +29,28 @@
> >   struct drm_crtc;
> >   struct drm_plane;
> >
> > -uint32_t drm_color_lut_extract(uint32_t user_input, uint32_t bit_precision);
> > +/**
> > + * drm_color_lut_extract - clamp and round LUT entries
> > + * @user_input: input value
> > + * @bit_precision: number of bits the hw LUT supports
> > + *
> > + * Extract a degamma/gamma LUT value provided by user (in the form of
> > + * &drm_color_lut entries) and round it to the precision supported by the
> > + * hardware.
> > + */
> > +static inline u32 drm_color_lut_extract(u32 user_input, int bit_precision)
> > +{
> > +	u32 val = user_input;
> > +	u32 max = 0xffff >> (16 - bit_precision);
> > +
> > +	/* Round only if we're not using full precision. */
> > +	if (bit_precision < 16) {
> > +		val += 1UL << (16 - bit_precision - 1);
> > +		val >>= 16 - bit_precision;
> > +	}
> > +
> > +	return clamp_val(val, 0, max);
> > +}
> >
> >   void drm_crtc_enable_color_mgmt(struct drm_crtc *crtc,
> >   				 uint degamma_lut_size,
> >
>

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
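
For readers who want to try the rounding outside the kernel, here is a minimal userspace sketch of the same arithmetic as drm_color_lut_extract() in the patch quoted above. The names lut_extract() and clamp_u32() are hypothetical stand-ins (clamp_u32() replaces the kernel's clamp_val()), and the assert() on the precision range is an added assumption about sensible inputs, not part of the patch. When such a function is inlined and bit_precision is a compile-time constant at the call site, the compiler can typically resolve the "if" at build time, which is the constant-folding point made in the reply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's clamp_val(). */
static inline uint32_t clamp_u32(uint32_t val, uint32_t lo, uint32_t hi)
{
	return val < lo ? lo : (val > hi ? hi : val);
}

/*
 * Same arithmetic as drm_color_lut_extract() in the quoted patch:
 * round a 16-bit LUT value to bit_precision bits, then clamp.
 */
static inline uint32_t lut_extract(uint32_t user_input, int bit_precision)
{
	uint32_t val = user_input;
	uint32_t max;

	/* Assumption for this sketch only: callers pass 1..16 bits. */
	assert(bit_precision >= 1 && bit_precision <= 16);

	max = 0xffff >> (16 - bit_precision);

	/* Round only if we're not using full precision. */
	if (bit_precision < 16) {
		val += 1UL << (16 - bit_precision - 1);	/* add half an LSB */
		val >>= 16 - bit_precision;		/* drop the low bits */
	}

	/* Rounding up can overshoot max, e.g. 0xffff at 10 bits. */
	return clamp_u32(val, 0, max);
}
```

With this sketch, lut_extract(0xffff, 10) rounds up to 0x400 before the clamp brings it back to 0x3ff, which shows why the clamp is still needed after the shift; lut_extract(0x1234, 16) returns the input unchanged.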