On Mon, 13 May 2024 09:15:13 +0200 Louis Chauvet <louis.chauvet@xxxxxxxxxxx> wrote: > Le 22/04/24 - 14:07, Pekka Paalanen a écrit : > > On Tue, 09 Apr 2024 15:25:25 +0200 > > Louis Chauvet <louis.chauvet@xxxxxxxxxxx> wrote: > > > > > Introduce the usage of block_h/block_w to compute the offset and the > > > pointer of a pixel. The previous implementation was specialized for > > > planes with block_h == block_w == 1. To avoid confusion and allow easier > > > implementation of tiled formats. It also remove the usage of the > > > deprecated format field `cpp`. > > > > > > Introduce the plane_index parameter to get an offset/pointer on a > > > different plane. > > > > > > Signed-off-by: Louis Chauvet <louis.chauvet@xxxxxxxxxxx> > > > --- > > > drivers/gpu/drm/vkms/vkms_formats.c | 110 ++++++++++++++++++++++++++++-------- > > > 1 file changed, 87 insertions(+), 23 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c > > > index 69cf9733fec5..9a1400ad4db6 100644 > > > --- a/drivers/gpu/drm/vkms/vkms_formats.c > > > +++ b/drivers/gpu/drm/vkms/vkms_formats.c > > > @@ -10,22 +10,43 @@ > > > #include "vkms_formats.h" > > > > > > /** > > > - * pixel_offset() - Get the offset of the pixel at coordinates x/y in the first plane > > > + * packed_pixels_offset() - Get the offset of the block containing the pixel at coordinates x/y > > > * > > > * @frame_info: Buffer metadata > > > * @x: The x coordinate of the wanted pixel in the buffer > > > * @y: The y coordinate of the wanted pixel in the buffer > > > + * @plane_index: The index of the plane to use > > > + * @offset: The returned offset inside the buffer of the block > > > + * @rem_x,@rem_y: The returned coordinate of the requested pixel in the block > > > * > > > - * The caller must ensure that the framebuffer associated with this request uses a pixel format > > > - * where block_h == block_w == 1. > > > - * If this requirement is not fulfilled, the resulting offset can point to an other pixel or > > > - * outside of the buffer. > > > + * As some pixel formats store multiple pixels in a block (DRM_FORMAT_R* for example), some > > > + * pixels are not individually addressable. This function return 3 values: the offset of the > > > + * whole block, and the coordinate of the requested pixel inside this block. > > > + * For example, if the format is DRM_FORMAT_R1 and the requested coordinate is 13,5, the offset > > > + * will point to the byte 5*pitches + 13/8 (second byte of the 5th line), and the rem_x/rem_y > > > + * coordinates will be (13 % 8, 5 % 1) = (5, 0) > > > + * > > > + * With this function, the caller just have to extract the correct pixel from the block. > > > */ > > > -static size_t pixel_offset(const struct vkms_frame_info *frame_info, int x, int y) > > > +static void packed_pixels_offset(const struct vkms_frame_info *frame_info, int x, int y, > > > + int plane_index, int *offset, int *rem_x, int *rem_y) > > > { > > > struct drm_framebuffer *fb = frame_info->fb; > > > + const struct drm_format_info *format = frame_info->fb->format; > > > + /* Directly using x and y to multiply pitches and format->ccp is not sufficient because > > > + * in some formats a block can represent multiple pixels. > > > + * > > > + * Dividing x and y by the block size allows to extract the correct offset of the block > > > + * containing the pixel. > > > + */ > > > > > > - return fb->offsets[0] + (y * fb->pitches[0]) + (x * fb->format->cpp[0]); > > > + int block_x = x / drm_format_info_block_width(format, plane_index); > > > + int block_y = y / drm_format_info_block_height(format, plane_index); > > > + *rem_x = x % drm_format_info_block_width(format, plane_index); > > > + *rem_y = y % drm_format_info_block_height(format, plane_index); > > > + *offset = fb->offsets[plane_index] + > > > + block_y * fb->pitches[plane_index] + > > > + block_x * format->char_per_block[plane_index]; > > > > I started thinking... is > > > > + block_y * fb->pitches[plane_index] + > > > > correct, or should it be > > > > + y * fb->pitches[plane_index] + > > > > ? > > The documentation is not very clear about that: > > * @pitches: Line stride per buffer. For userspace created object this > * is copied from drm_mode_fb_cmd2. > > If I look at the drm_mode_fb_cmd2, there is this documentation: > > /** @pitches: Pitch (aka. stride) in bytes, one per plane. */ > > For me, I interpret "stride" as it is used in matrix calculation, where it > means "the number of bytes between two number adjacent verticaly" > (&matrix[x,y] + stride == &matrix[x,y+1]). > > So in a graphic context, I interpret a stride as the number of byte to > reach the next line of blocks (as pixels can not always be accessed > individually). > > So, for me, buffer_size_in_byte >= stride * number_of_lines. This is the definition, yes. Even for blocky formats, it is still number of 1-pixel-high lines, even though one cannot address a line as such. For blocky formats it is a theoretical value, and computing with it only makes sense when your number of lines is a multiple of block height. That's my recollection. This has been hashed in issues like https://gitlab.freedesktop.org/wayland/weston/-/issues/896 > > I'm looking at drm_format_info_min_pitch() which sounds like it should > > be the latter? Because of > > > > return DIV_ROUND_UP_ULL((u64)buffer_width * info->char_per_block[plane], > > drm_format_info_block_width(info, plane) * > > drm_format_info_block_height(info, plane)); > > > > in drm_format_info_min_pitch(). > > This function doesn't make sense to me. I can't understand how it could > work. > > If I consider the X0L2 format, with block_h == block_w == 2, > char_per_block = 8, and a framebuffer of 1 * 10 pixels, the result of > drm_format_info_min_pitch is 2. If buffer_width is 1, then buffer_width / block_w is 1 because of rounding up. Cannot have images with partial blocks. That is a block-row of one block. That block takes char_per_block bytes, that is, 8 bytes here. But block height is 2, and stride is computed for 1-pixel-high line, so we divide by block_h, and the result is stride = 4 bytes. However, 1 * 8 / (2 * 2) = 2. I think the code is bugged, and the round-up happens at a wrong point. The correct form would be div_round_up(div_round_up(buffer_width, block_w) * char_per_block, block_h) in my opinion. There must always be an integer number of blocks on a block-row. If a block-row has multiple blocks, doing a div_round_up(char_per_block, block_h) would over-estimate the per-block bytes and add the extra for each block rather than averaging it out. So, multiplying by char_per_block before dividing by block_h gives a stricter minimum stride. The condition buffer_size_bytes >= buffer_height * stride is necessary but not always sufficient, because the buffer must hold an integer number of block-rows. > However, for this format and this framebuffer, I think the stride should > be at least 8 bytes (the buffer is "1 block width"). I believe the stride is 4 bytes, because stride for a 1-pixel-high line on average, rounded up. Stride allows for unused blocks per block-row, but its use for buffer size checking with buffer_height is incomplete, I believe. For the question "is buffer_size enough?" it may produce false-positives, but it cannot produce false-negatives. Thanks, pq > If pitch equals 2 (as suggested by the test), it implies that > height * pitch is not valid for calculating the minimum size of the buffer > (in our case, 10 * 2 = 20 bytes, but the minimum framebuffer size should > be 5 blocks * 8 bytes_per_block = 40 bytes). And I don't understand what > the 2 represents in this context. > Is it the number of pixels on a line (which is strange, because pitch > should be in byte)? The width in byte of the first line, but by using the > "byte per pixel" value (which make no sense when using block formats)? > > If pitch equals 8 (as I would expect), height * pitch is not optimal (but > at least sufficient to contain the entire framebuffer), and height * pitch > / block_h is optimal (which is logical, because height/block_h is the > number of block per columns). > > > Btw. maybe this should check that the result is not negative (e.g. due > > to overflow)? Or does that even work since signed overflow is undefined > > behavior (UB) and compilers may assume UB does not happen, causing the > > check to be eliminated as dead code? > > > > Otherwise this patch looks ok to me. > > > > > > Thanks, > > pq > > > > [...] > > -- > Louis Chauvet, Bootlin > Embedded Linux and Kernel engineering > https://bootlin.com
Attachment:
pgprI9dvytCWZ.pgp
Description: OpenPGP digital signature