On 2024-10-04 07:43, Louis Chauvet wrote: > On 03/10/24 - 16:01, Harry Wentland wrote: >> Certain operations require us to preserve values below 0.0 and >> above 1.0 (0x0 and 0xffff respectively in 16 bpc unorm). One >> such operation is a BT709 encoding operation followed by its >> decoding operation, or the reverse. >> >> We'll use s32 values as intermediate in and outputs of our >> color operations, for the operations where it matters. >> >> For now this won't apply to LUT operations. We might want to >> update those to work on s32 as well, but it's unclear how >> that should work for unorm LUT definitions. We'll revisit >> that once we add LUT + CTM tests. >> >> In order to allow for this we'll also invert the nesting of our >> colorop processing loops. We now use the pixel iteration loop >> on the outside and the colorop iteration on the inside. >> >> Signed-off-by: Harry Wentland <harry.wentland@xxxxxxx> >> --- >> v6: >> - use clamp_val instead of manual clamping (Louis Chauvet) > > Perfect! > >> v4: >> - Clarify that we're packing 16-bit UNORM into s32, not >> converting values to a different representation (Pekka) >> >> v3: >> - Use new colorop->next pointer >> >> drivers/gpu/drm/vkms/vkms_composer.c | 27 +++++++++++++++++++++++++-- >> drivers/gpu/drm/vkms/vkms_drv.h | 4 ++++ >> 2 files changed, 29 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c >> index b4aaad2bf45f..01fa81ebc93b 100644 >> --- a/drivers/gpu/drm/vkms/vkms_composer.c >> +++ b/drivers/gpu/drm/vkms/vkms_composer.c >> @@ -159,7 +159,7 @@ static void apply_lut(const struct vkms_crtc_state *crtc_state, struct line_buff >> } >> } >> >> -static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop *colorop) >> +static void apply_colorop(struct pixel_argb_s32 *pixel, struct drm_colorop *colorop) > > I agree with this change, but I think we may face conversion issues > in apply_lut_to_channel_value, as it takes u16 and returns u16, but with > your change, you pass s32 and expect s32. > apply_lut() still passes and expects a u16. apply_colorop() should be fine with passing/getting u16 values to apply_lut_to_channel_value(). The only way I could see this become an issue is if someone uses a CTM that results in negative or greater than 16-bit (1.0f) values that then feed into a LUT. But I'm not sure how realistic such a use-case is and would rather address this when we see a need for it. An IGT test that creates such a scenario would fail, but we don't have such a test currently. I also don't have time to dig into it, rework apply_lut_to_channel_value and do correct clipping where needed and I'd rather not touch it unless I have time to be thorough with it. So, I think it's fine for now but we might want to rework it at some point. Harry > If it is not an issue: Reviewed-by: Louis Chauvet <louis.chauvet@xxxxxxxxxxx> > >> { >> struct drm_colorop_state *colorop_state = colorop->state; >> >> @@ -185,9 +185,26 @@ static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop *colo >> >> static void pre_blend_color_transform(const struct vkms_plane_state *plane_state, struct line_buffer *output_buffer) >> { >> + struct pixel_argb_s32 pixel; >> + >> for (size_t x = 0; x < output_buffer->n_pixels; x++) { >> struct drm_colorop *colorop = plane_state->base.base.color_pipeline; >> >> + /* >> + * Some operations, such as applying a BT709 encoding matrix, >> + * followed by a decoding matrix, require that we preserve >> + * values above 1.0 and below 0.0 until the end of the pipeline. >> + * >> + * Pack the 16-bit UNORM values into s32 to give us head-room to >> + * avoid clipping until we're at the end of the pipeline. Clip >> + * intentionally at the end of the pipeline before packing >> + * UNORM values back into u16. >> + */ >> + pixel.a = output_buffer->pixels[x].a; >> + pixel.r = output_buffer->pixels[x].r; >> + pixel.g = output_buffer->pixels[x].g; >> + pixel.b = output_buffer->pixels[x].b; >> + >> while (colorop) { >> struct drm_colorop_state *colorop_state; >> >> @@ -197,10 +214,16 @@ static void pre_blend_color_transform(const struct vkms_plane_state *plane_state >> return; >> >> if (!colorop_state->bypass) >> - apply_colorop(&output_buffer->pixels[x], colorop); >> + apply_colorop(&pixel, colorop); >> >> colorop = colorop->next; >> } >> + >> + /* clamp values */ >> + output_buffer->pixels[x].a = clamp_val(pixel.a, 0, 0xffff); >> + output_buffer->pixels[x].r = clamp_val(pixel.r, 0, 0xffff); >> + output_buffer->pixels[x].g = clamp_val(pixel.g, 0, 0xffff); >> + output_buffer->pixels[x].b = clamp_val(pixel.b, 0, 0xffff); >> } >> } >> >> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h >> index 278cf3533f58..b78bc2611746 100644 >> --- a/drivers/gpu/drm/vkms/vkms_drv.h >> +++ b/drivers/gpu/drm/vkms/vkms_drv.h >> @@ -36,6 +36,10 @@ struct vkms_frame_info { >> unsigned int cpp; >> }; >> >> +struct pixel_argb_s32 { >> + s32 a, r, g, b; >> +}; >> + >> struct pixel_argb_u16 { >> u16 a, r, g, b; >> }; >> -- >> 2.46.2 >>