Re: [PATCH v2 2/7] drm/vkms: Add support for multi-planar framebuffers

Arthur Grillo <arthurgrillo@xxxxxxxxxx> · Fri, 2 Feb 2024 15:49:42 -0300

On 01/02/24 14:38, Louis Chauvet wrote:
>>  
>>  /*
>> @@ -23,27 +23,25 @@ static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int
>>   * @frame_info: Buffer metadata
>>   * @x: The x(width) coordinate of the 2D buffer
>>   * @y: The y(Heigth) coordinate of the 2D buffer
>> + * @index: The index of the plane on the 2D buffer
>>   *
>>   * Takes the information stored in the frame_info, a pair of coordinates, and
>> - * returns the address of the first color channel.
>> - * This function assumes the channels are packed together, i.e. a color channel
>> - * comes immediately after another in the memory. And therefore, this function
>> - * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
>> + * returns the address of the first color channel on the desired index.
>>   */
>>  static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
>> -				int x, int y)
>> +				int x, int y, size_t index)
>>  {
>> -	size_t offset = pixel_offset(frame_info, x, y);
>> +	size_t offset = pixel_offset(frame_info, x, y, index);
>>  
>>  	return (u8 *)frame_info->map[0].vaddr + offset;
>>  }
> 
> This implementation of packed_pixels_addr will only work with
> block_w == block_h == 1. For packed or tiled formats we will need to use
> x/y information to extract the correct address, and this address will not 
> be a single pixel. See below my explanation.

You're right: currently, VKMS only supports formats that are neither
packed nor tiled. As none of the formats I plan to add are packed or
tiled either, I haven't added support for them. But if you want to add
it, please do :).
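
For reference, a minimal sketch of what a block-aware address computation
could look like. This is not VKMS code: `struct plane_layout` and
`block_offset()` are hypothetical names, and the sketch assumes `pitch` is
the number of bytes per row of blocks, which may not match how DRM defines
pitches for every format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-plane layout fields. */
struct plane_layout {
	size_t offset;         /* byte offset of the plane in the buffer */
	size_t pitch;          /* ASSUMPTION: bytes per row of blocks */
	size_t char_per_block; /* bytes per block */
	size_t block_w;        /* block width in pixels */
	size_t block_h;        /* block height in pixels */
};

/*
 * Byte offset of the block that contains pixel (x, y).
 * With block_w == block_h == 1 this reduces to the usual
 * offset + y * pitch + x * cpp computation.
 */
static size_t block_offset(const struct plane_layout *p, size_t x, size_t y)
{
	assert(p->block_w > 0 && p->block_h > 0);

	return p->offset
	       + (y / p->block_h) * p->pitch
	       + (x / p->block_w) * p->char_per_block;
}
```

Note that the result addresses a whole block, not a single pixel, so the
caller still needs x and y to pick the pixel inside the block.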

>> @@ -130,17 +128,28 @@ void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state
>>  {
>>  	struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
>>  	struct vkms_frame_info *frame_info = plane->frame_info;
>> -	u8 *src_pixels = get_packed_src_addr(frame_info, y);
>> +	const struct drm_format_info *frame_format = frame_info->fb->format;
>>  	int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
>> +	u8 *src_pixels[DRM_FORMAT_MAX_PLANES];
>>  
>> -	for (size_t x = 0; x < limit; x++, src_pixels += frame_info->fb->format->cpp[0]) {
>> +	for (size_t i = 0; i < frame_format->num_planes; i++)
>> +		src_pixels[i] = get_packed_src_addr(frame_info, y, i);
>> +
>> +	for (size_t x = 0; x < limit; x++) {
>>  		int x_pos = get_x_position(frame_info, limit, x);
>>  
>> -		if (drm_rotation_90_or_270(frame_info->rotation))
>> -			src_pixels = get_packed_src_addr(frame_info, x + frame_info->rotated.y1)
>> -				+ frame_info->fb->format->cpp[0] * y;
>> +		if (drm_rotation_90_or_270(frame_info->rotation)) {
>> +			for (size_t i = 0; i < frame_format->num_planes; i++) {
>> +				src_pixels[i] = get_packed_src_addr(frame_info,
>> +								    x + frame_info->rotated.y1, i);
>> +				src_pixels[i] += frame_format->cpp[i] * y;
> 
> I find the current rotation management a bit complex to understand. This 
> is not related to your patch, but as I had to understand this to create my 
> second patch, I think this could be significantly simplified.

I also found the rotation logic complex when implementing this. I would
appreciate it if it were simplified.

> 
> Please see the below comment about frame_format->cpp, it applies here too. 
> I think the "easy" way here is simply to reuse the method 
> get_packed_src_addr every time you need a new pixel.
> 
>> +			}
>> +		}
>>  
>> 		plane->pixel_read(src_pixels, &out_pixels[x_pos]);
>> +
> 
> The usage of cpp and a pointer to a specific pixel only works for 
> non-packed and non-blocked pixels, but formats such as NV30 or Y0L0 need 
> more information about the exact location of the pixel to convert and 
> write the correct pixel value (each pixel can't be referenced directly by 
> a pointer). For example NV30 uses 5 bytes to store 4 pixels (10 bits 
> each), so to access one in the "middle" you need to read the 5 bytes and 
> do a small computation to extract its value.

Great explanation, I can see what the problem is here.
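
To make the "small computation" concrete, here is a rough sketch. It
assumes a block of 5 bytes holding four 10-bit samples, packed
little-endian with sample 0 in the lowest bits; that layout and the name
`read_10bit_sample()` are assumptions for illustration, so check
drm_fourcc.h for the real layout of a given format.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read sample `i` (0..3) from a 5-byte block holding four 10-bit samples.
 * ASSUMPTION: little-endian packing, sample 0 in bits [9:0].
 */
static uint16_t read_10bit_sample(const uint8_t block[5], unsigned int i)
{
	uint64_t bits = 0;

	assert(i < 4);

	/* Assemble the 40-bit little-endian word from the 5 bytes. */
	for (unsigned int b = 0; b < 5; b++)
		bits |= (uint64_t)block[b] << (8 * b);

	/* Shift the wanted sample down and mask off its 10 bits. */
	return (uint16_t)((bits >> (10 * i)) & 0x3ff);
}
```

The samples straddle byte boundaries, which is why a plain `u8 *` advanced
by cpp cannot point at one of them.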

> 
> I think a simple solution to handle most cases would be to provide two 
> more parameters: the x and y positions of the pixel to copy, using 
> "absolute coordinates" (i.e. x=0,y=0 means the first byte of the src 
> buffer, not the first pixel in the `drm_rect src`; this way the method 
> `pixel_read` can extract the correct value).
> 
> This way it becomes easy to manage "complex" pixel representations in 
> this loop: simply increment x/y and let the pixel_read method handle 
> everything.
> 
> The second patch I will send is doing this. And as explained before, it 
> will also greatly simplify the code related to rotation and translation 
> (no more switch case everywhere to add an offset to x/y, it simply uses 
> the drm_rect_* helpers).

I like this, expect my review soon :).
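
Just to check my understanding of the proposal, here is a rough userspace
sketch of a reader that takes absolute coordinates. None of this is the
actual patch: `struct frame`, `pixel_read_t`, `read_bgra8888()` and
`compose_row()` are made-up names, and the example reader assumes a 4-byte
pixel stored as B, G, R, A in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pixel_argb_u16 { uint16_t a, r, g, b; };

/* Hypothetical, simplified stand-in for the frame metadata. */
struct frame {
	const uint8_t *data; /* first byte of the source buffer */
	size_t pitch;        /* bytes per row */
	int src_x1, src_y1;  /* top-left corner of the source rectangle */
};

/* Reader callback: (x, y) are absolute buffer coordinates. */
typedef void (*pixel_read_t)(const struct frame *f, int x, int y,
			     struct pixel_argb_u16 *out);

/* Example reader. ASSUMPTION: 4 bytes per pixel, stored as B, G, R, A. */
static void read_bgra8888(const struct frame *f, int x, int y,
			  struct pixel_argb_u16 *out)
{
	const uint8_t *p;

	assert(x >= 0 && y >= 0);
	p = f->data + (size_t)y * f->pitch + (size_t)x * 4;

	/* Expand 8-bit channels to 16 bits (0xff -> 0xffff). */
	out->b = (uint16_t)(p[0] * 257);
	out->g = (uint16_t)(p[1] * 257);
	out->r = (uint16_t)(p[2] * 257);
	out->a = (uint16_t)(p[3] * 257);
}

/* The loop only increments x; the reader locates the bytes itself. */
static void compose_row(const struct frame *f, pixel_read_t read,
			int y, int width, struct pixel_argb_u16 *out)
{
	for (int x = 0; x < width; x++)
		read(f, f->src_x1 + x, f->src_y1 + y, &out[x]);
}
```

A reader for a packed or blocked format would have the same signature and
would do its own block lookup from x and y, so the loop would not need to
know anything about cpp.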

> 
> It's not optimal in terms of performance (in some situations it will 
> read the same block multiple times to generate different pixels), but I 
> believe it is still an interesting trade-off.
> 
> In the future, if performance is actually critical, the whole composition 
> loop will have to be specialized for each pixel format: some can be 
> treated line by line (as it's done today), but with blocks or packed 
> pixels it's more complex.
> 
>> +		for (size_t i = 0; i < frame_format->num_planes; i++)
>> +			src_pixels[i] += frame_format->cpp[i];
> 
> This is likely working with format with block_w != 1, see explanation 
> above.

I think you meant that this is _not_ working. Yeah, as I already
explained, it was never my plan to add support for packed or tiled formats.

Best Regards,
~Arthur Grillo