On April 26, 2022 10:03:09 PM GMT-03:00, Igor Torrente <igormtorrente@xxxxxxxxx> wrote: > > >On 4/25/22 22:54, Igor Torrente wrote: >> Hi Pekka, >> >> On 4/25/22 05:10, Pekka Paalanen wrote: >>> On Sat, 23 Apr 2022 15:53:20 -0300 >>> Igor Torrente <igormtorrente@xxxxxxxxx> wrote: >>> >>>> I forgot to respond some points from your review. >>>> >>>> On 4/23/22 13:04, Igor Torrente wrote: >>>>> Hi Pekka, >>>>> >>>>> On 4/20/22 09:36, Pekka Paalanen wrote: >>>>>> On Mon, 4 Apr 2022 17:45:12 -0300 >>>>>> Igor Torrente <igormtorrente@xxxxxxxxx> wrote: >>>>>> >>>>>>> Currently the blend function only accepts XRGB_8888 and ARGB_8888 >>>>>>> as a color input. >>>>>>> >>>>>>> This patch refactors all the functions related to the plane composition >>>>>>> to overcome this limitation. >>>>>>> >>>>>>> A new internal format(`struct pixel`) is introduced to deal with all >>>>>> >>>>>> Hi, >>>>>> >>>>>> struct pixel_argb_u16 was added in the previous patch. >>>>> >>>>> I will fix it. Thanks! >>> >>> ... >>> >>>>>>> +static int compose_active_planes(struct vkms_writeback_job *active_wb, >>>>>>> + struct vkms_crtc_state *crtc_state, >>>>>>> + u32 *crc32) >>>>>>> { >>>>>>> + int line_width, ret = 0, pixel_size = sizeof(struct pixel_argb_u16); >>>>>>> + struct vkms_frame_info *primary_plane_info = NULL; >>>>>>> + struct line_buffer output_buffer, stage_buffer; >>>>>>> + struct vkms_plane_state *act_plane = NULL; >>>>>>> + u32 wb_format; >>>>>>> + if (WARN_ON(pixel_size != 8)) >>>>>> >>>>>> Isn't there a compile-time assert macro for this? Having to actually >>>>>> run VKMS to check for this reduces the chances of finding it early. >>>>>> What's the reason for this check anyway? >>>> >>>> Yes, it exists. >>>> >>>> include/linux/build_bug.h:1:#define static_assert(expr, ...) >>>> __static_assert(expr, ##__VA_ARGS__, #expr) >>>> >>>> I didn't add it because I can imagine some people very mad if the kernel >>>> did not compile because of vkms. >>> >>> But that would mean that VKMS is broken on those platforms. You'd >>> better know which platforms VKMS is broken, so the Kconfig can stop >>> VKMS from being built there at all. Or better, fix it before anyone >>> needs VKMS there. >> >> Right. Makes sense. I will add it then. >> >>> >>>> This check exists so we can call `crc32_le` for the entire line instead >>>> doing it for each channel of each pixel in case `struct `pixel_argb_u16` >>>> had any gap added by the compiler between the struct fields. >>> >>> Oh the CRC computation. Good point. >>> >>> Can you add a comment about that with the check? >> >> Yeah, np. >> >> I will copy the explanation above :) >> >>> >>>>>> >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> + if (crtc_state->num_active_planes >= 1) { >>>>>>> + act_plane = crtc_state->active_planes[0]; >>>>>>> + if (act_plane->base.base.plane->type == DRM_PLANE_TYPE_PRIMARY) >>>>>>> + primary_plane_info = act_plane->frame_info; >>>>>> >>>>>> After the next patch, do you even need the primary plane for anything >>>>>> specifically? >>>> >>>> Yeah, I will not need it anymore. >>>> >>>>>> There is the map_is_null check below, but that should be >>>>>> done on all planes in the array, right? >>>> >>>> Yes, I guess so. And I don't know why it only checks for the >>>> primary_plane TBH. >>> >>> Maybe a left-over from times when it didn't have anything but a primary >>> plane? >> >> Maybe. >> >> Anyway, I have added this verification to all active planes in the next >> version. >> >>> >>>>>> >>>>>> I suspect the next patch, or another patch in this series, should just >>>>>> delete this chunk. >>>> I should, and I will in the V6 of next patch. >>>> >>>>> >>>>> >>>>> >>>>>> >>>>>>> } >>>>>>> + if (!primary_plane_info) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> if (WARN_ON(dma_buf_map_is_null(&primary_plane_info->map[0]))) >>>>>>> return -EINVAL; >>>>>>> + if (WARN_ON(check_format_funcs(crtc_state, active_wb))) >>>>>>> + return -EINVAL; >>>>>>> + line_width = drm_rect_width(&primary_plane_info->dst); >>>>>>> + stage_buffer.n_pixels = line_width; >>>>>>> + output_buffer.n_pixels = line_width; >>>>>>> + stage_buffer.pixels = kvmalloc(line_width * pixel_size, GFP_KERNEL); >>>>>>> + if (!stage_buffer.pixels) { >>>>>>> + DRM_ERROR("Cannot allocate memory for the output line buffer"); >>>>>>> + return -ENOMEM; >>>>>>> + } >>>>>>> + output_buffer.pixels = kvmalloc(line_width * pixel_size, GFP_KERNEL); >>>>>>> + if (!output_buffer.pixels) { >>>>>>> + DRM_ERROR("Cannot allocate memory for intermediate line buffer"); >>>>>>> + ret = -ENOMEM; >>>>>>> + goto free_stage_buffer; >>>>>>> + } >>>>>>> + >>>>>>> + if (active_wb) { >>>>>>> + struct vkms_frame_info *wb_frame_info = &active_wb->frame_info; >>>>>>> + >>>>>>> + wb_format = wb_frame_info->fb->format->format; >>>>>> >>>>>> I don't see wb_format being used, is it? >>>>> >>>>> This is probably a leftover from the last versions. Thanks for catching >>>>> it. >>>>> >>>>>> >>>>>>> + wb_frame_info->src = primary_plane_info->src; >>>>>>> + wb_frame_info->dst = primary_plane_info->dst; >>>>>>> + } >>>>>>> + >>>>>>> + blend(active_wb, crtc_state, crc32, &stage_buffer, >>>>>>> + &output_buffer, (s64)line_width * pixel_size); >>>>>> >>>>>> What's the (s64) doing here? >>>>>> >>>>>> Are byte sizes not usually expressed with size_t or ssize_t types, or >>>>>> is the kernel convention to use u64 and s64? >>>>>> >>>>>> This makes me suspect that pixel_offset() and friends in vkms_format.c >>>>>> are going to need fixing as well. int type overflows at 2G. >>>>> >>>>> >>>>> Yeah, I should be using size_t in all these places. >>>>> >>>>>> >>>>>>> + >>>>>>> + kvfree(output_buffer.pixels); >>>>>>> +free_stage_buffer: >>>>>>> + kvfree(stage_buffer.pixels); >>>>>>> +can >>>>>>> + return ret; >>>>>>> } >>>>>>> /** >>>>>>> @@ -222,13 +204,11 @@ void vkms_composer_worker(struct work_struct *work) >>>>>>> struct vkms_crtc_state, >>>>>>> composer_work); >>>>>>> struct drm_crtc *crtc = crtc_state->base.crtc; >>>>>>> + struct vkms_writeback_job *active_wb = crtc_state->active_writeback; >>>>>>> struct vkms_output *out = drm_crtc_to_vkms_output(crtc); >>>>>>> bool crc_pending, wb_pending; >>>>>>> u64 frame_start, frame_end; >>>>>>> + u32 crc32 = 0; >>>>>>> int ret; >>>>>>> spin_lock_irq(&out->composer_lock); >>>>>>> @@ -248,35 +228,19 @@ void vkms_composer_worker(struct work_struct *work) >>>>>>> if (!crc_pending) >>>>>>> return; >>>>>>> if (wb_pending) >>>>>>> + ret = compose_active_planes(active_wb, crtc_state, &crc32); >>>>>>> + else >>>>>>> + ret = compose_active_planes(NULL, crtc_state, &crc32); >>>>>>> + if (ret) >>>>>>> return; >>>>>>> if (wb_pending) { >>>>>>> drm_writeback_signal_completion(&out->wb_connector, 0); >>>>>>> spin_lock_irq(&out->composer_lock); >>>>>>> crtc_state->wb_pending = false; >>>>>>> spin_unlock_irq(&out->composer_lock); >>>>>>> } >>>>>>> /* >>>>>>> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c >>>>>>> new file mode 100644 >>>>>>> index 000000000000..931a61405d6a >>>>>>> --- /dev/null >>>>>>> +++ b/drivers/gpu/drm/vkms/vkms_formats.c >>>>>>> @@ -0,0 +1,151 @@ >>>>>>> +// SPDX-License-Identifier: GPL-2.0+ >>>>>>> + >>>>>>> +#include <drm/drm_rect.h> >>>>>>> +#include <linux/minmax.h> >>>>>>> + >>>>>>> +#include "vkms_formats.h" >>>>>>> + >>>>>>> +static int pixel_offset(const struct vkms_frame_info *frame_info, int x, int y) >>>>>>> +{ >>>>>>> + return frame_info->offset + (y * frame_info->pitch) >>>>>>> + + (x * frame_info->cpp); >>>>>>> +} >>>>>>> + >>>>>>> +/* >>>>>>> + * packed_pixels_addr - Get the pointer to pixel of a given pair of coordinates >>>>>>> + * >>>>>>> + * @frame_info: Buffer metadata >>>>>>> + * @x: The x(width) coordinate of the 2D buffer >>>>>>> + * @y: The y(Heigth) coordinate of the 2D buffercan >>>>>>> + * >>>>>>> + * Takes the information stored in the frame_info, a pair of coordinates, and >>>>>>> + * returns the address of the first color channel. >>>>>>> + * This function assumes the channels are packed together, i.e. a color channel >>>>>>> + * comes immediately after another in the memory. And therefore, this function >>>>>>> + * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21). >>>>>>> + */ >>>>>>> +static void *packed_pixels_addr(const struct vkms_frame_info *frame_info, >>>>>>> + int x, int y) >>>>>>> +{ >>>>>>> + int offset = pixel_offset(frame_info, x, y); >>>>>>> + >>>>>>> + return (u8 *)frame_info->map[0].vaddr + offset; >>>>>>> +} >>>>>>> + >>>>>>> +static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y) >>>>>>> +{ >>>>>>> + int x_src = frame_info->src.x1 >> 16; >>>>>>> + int y_src = y - frame_info->dst.y1 + (frame_info->src.y1 >> 16); >>>>>>> + >>>>>>> + return packed_pixels_addr(frame_info, x_src, y_src); >>>>>>> +} >>>>>>> + >>>>>>> +static void ARGB8888_to_argb_u16(struct line_buffer *stage_buffer, >>>>>>> + const struct vkms_frame_info *frame_info, int y) >>>>>>> +{ >>>>>>> + struct pixel_argb_u16 *out_pixels = stage_buffer->pixels; >>>>>>> + u8 *src_pixels = get_packed_src_addr(frame_info, y); >>>>>>> + int x, x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), >>>>>>> + stage_buffer->n_pixels); >>>>>>> + >>>>>>> + for (x = 0; x < x_limit; x++, src_pixels += 4) { >>>>>>> + /* >>>>>>> + * The 257 is the "conversion ratio". This number is obtained by the >>>>>>> + * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get >>>>>>> + * the best color value in a pixel format with more possibilities. >>>>>>> + * A similar idea applies to others RGB color conversions. >>>>>>> + */ >>>>>>> + out_pixels[x].a = (u16)src_pixels[3] * 257; >>>>>>> + out_pixels[x].r = (u16)src_pixels[2] * 257; >>>>>>> + out_pixels[x].g = (u16)src_pixels[1] * 257; >>>>>>> + out_pixels[x].b = (u16)src_pixels[0] * 257; >>>>>>> + } >>>>>>> +} >>>>>>> + >>>>>>> +static void XRGB8888_to_argb_u16(struct line_buffer *stage_buffer, >>>>>>> + const struct vkms_frame_info *frame_info, int y) >>>>>>> +{ >>>>>>> + struct pixel_argb_u16 *out_pixels = stage_buffer->pixels; >>>>>>> + u8 *src_pixels = get_packed_src_addr(frame_info, y); >>>>>>> + int x, x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), >>>>>>> + stage_buffer->n_pixels); >>>>>>> + >>>>>>> + for (x = 0; x < x_limit; x++, src_pixels += 4) { >>>>>>> + out_pixels[x].a = (u16)0xffff; >>>>>>> + out_pixels[x].r = (u16)src_pixels[2] * 257; >>>>>>> + out_pixels[x].g = (u16)src_pixels[1] * 257; >>>>>>> + out_pixels[x].b = (u16)src_pixels[0] * 257; >>>>>>> + } >>>>>>> +} >>>>>>> + >>>>>>> +/* >>>>>>> + * The following functions take an line of argb_u16 pixels from the >>>>>>> + * src_buffer, convert them to a specific format, and store them in the >>>>>>> + * destination. >>>>>>> + * >>>>>>> + * They are used in the `compose_active_planes` to convert and store a line >>>>>>> + * from the src_buffer to the writeback buffer. >>>>>>> + */ >>>>>>> +static void argb_u16_to_ARGB8888(struct vkms_frame_info *frame_info, >>>>>>> + const struct line_buffer *src_buffer, int y) >>>>>>> +{ >>>>>>> + int x, x_dst = frame_info->dst.x1; >>>>>>> + u8 *dst_pixels = packed_pixels_addr(frame_info, x_dst, y); >>>>>>> + struct pixel_argb_u16 *in_pixels = src_buffer->pixels; >>>>>>> + int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), >>>>>>> + src_buffer->n_pixels); >>>>>>> + >>>>>>> + for (x = 0; x < x_limit; x++, dst_pixels += 4) { >>>>>>> + /* >>>>>>> + * This sequence below is important because the format's byte order is >>>>>>> + * in little-endian. In the case of the ARGB8888 the memory is >>>>>>> + * organized this way: >>>>>>> + * >>>>>>> + * | Addr | = blue channel >>>>>>> + * | Addr + 1 | = green channel >>>>>>> + * | Addr + 2 | = Red channel >>>>>>> + * | Addr + 3 | = Alpha channel >>>>>>> + */ >>>>>>> + dst_pixels[3] = DIV_ROUND_CLOSEST(in_pixels[x].a, 257); >>>>>>> + dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixels[x].r, 257); >>>>>>> + dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixels[x].g, 257); >>>>>>> + dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixels[x].b, 257); >>>>>>> + } >>>>>>> +} >>>>>>> + >>>>>>> +static void argb_u16_to_XRGB8888(struct vkms_frame_info *frame_info, >>>>>>> + const struct line_buffer *src_buffer, int y) >>>>>>> +{ >>>>>>> + int x, x_dst = frame_info->dst.x1; >>>>>>> + u8 *dst_pixels = packed_pixels_addr(frame_info, x_dst, y); >>>>>>> + struct pixel_argb_u16 *in_pixels = src_buffer->pixels; >>>>>>> + int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst), >>>>>>> + src_buffer->n_pixels); >>>>>>> + >>>>>>> + for (x = 0; x < x_limit; x++, dst_pixels += 4) { >>>>>>> + dst_pixels[3] = (u8)0xff; >>>>>> >>>>>> When writing to XRGB, it's not necessary to ensure the X channel has >>>>>> any sensible value. Anyone reading from XRGB must ignore that value >>>>>> anyway. So why not write something wacky here, like 0xa1, that is far >>>>>> enough from both 0x00 or 0xff to not be confused with them even >>>>>> visually? Also not 0x7f or 0x80 which are close to half of 0xff. >>>>>> >>>>>> Or, you could save a whole function and just use argb_u16_to_ARGBxxxx() >>>>>> instead, even for XRGB destination. >>>>> >>>>> >>>>> Right. Maybe I could just leave the channel untouched. >>> >>> Untouched may not be a good idea. Leaving anything untouched always has >>> the risk of leaking information through uninitialized memory. Maybe not >>> in this case because the destination is allocated by userspace already, >>> but nothing beats being obviously correct. >> >> Makes sense. >> >>> >>> Whatever you decide here, be prepared for it becoming de-facto kernel >>> UABI, because it is easy for userspace to (accidentally) rely on the >>> value, no matter what you pick. >> >> I hope to make the right decision then. > >The de-facto UABI seems to be already in place for {A, X}RGB8888. "Only XRGB_8888 > >I changed from 0xff to 0xbe and the `writeback-check-output` started to fail. > >> >>> >>> >>> Thanks, >>> pq >>> >>> >>>>> >>>>>> >>>>>>> + dst_pixels[2] = DIV_ROUND_CLOSEST(in_pixels[x].r, 257); >>>>>>> + dst_pixels[1] = DIV_ROUND_CLOSEST(in_pixels[x].g, 257); >>>>>>> + dst_pixels[0] = DIV_ROUND_CLOSEST(in_pixels[x].b, 257); >>>>>>> + } >>>>>>> +} >>>>>>> + >>>>>>> +plane_format_transform_func get_plane_fmt_transform_function(u32 format) >>>>>>> +{ >>>>>>> + if (format == DRM_FORMAT_ARGB8888) >>>>>>> + return &ARGB8888_to_argb_u16; >>>>>>> + else if (format == DRM_FORMAT_XRGB8888) >>>>>>> + return &XRGB8888_to_argb_u16; >>>>>>> + else >>>>>>> + return NULL; >>>>>> >>>>>> This works for now, but when more formats are added, I'd think a switch >>>>>> statement would look better. >>>>> >>>>> ok. >>>>> >>>>>> >>>>>>> +} >>>>>>> + >>>>>>> +wb_format_transform_func get_wb_fmt_transform_function(u32 format) >>>>>>> +{ >>>>>>> + if (format == DRM_FORMAT_ARGB8888) >>>>>>> + return &argb_u16_to_ARGB8888; >>>>>>> + else if (format == DRM_FORMAT_XRGB8888) >>>>>>> + return &argb_u16_to_XRGB8888; >>>>>>> + else >>>>>>> + return NULL; >>>>>>> +}