On Friday, 30 November 2018 at 08:30:47 CET, Maxime Ripard wrote:
> On Tue, Nov 27, 2018 at 05:30:00PM +0100, Jernej Škrabec wrote:
> > > > > +static void _cedrus_write_ref_list(struct cedrus_ctx *ctx,
> > > > > +				   struct cedrus_run *run,
> > > > > +				   const u8 *ref_list, u8 num_ref,
> > > > > +				   enum cedrus_h264_sram_off sram)
> > > > > +{
> > > > > +	const struct v4l2_ctrl_h264_decode_param *decode = run->h264.decode_param;
> > > > > +	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
> > > > > +	struct cedrus_dev *dev = ctx->dev;
> > > > > +	u32 sram_array[CEDRUS_MAX_REF_IDX / sizeof(u32)];
> > > > > +	unsigned int size, i;
> > > > > +
> > > > > +	memset(sram_array, 0, sizeof(sram_array));
> > > > > +
> > > > > +	for (i = 0; i < num_ref; i += 4) {
> > > > > +		unsigned int j;
> > > > > +
> > > > > +		for (j = 0; j < 4; j++) {
> > > >
> > > > I don't think you need to complicate this with two loops.
> > > > cedrus_h264_write_sram() takes void * and it aligns to 4 anyway. So as
> > > > long as the input buffer size is a multiple of 4
> > > > (u8[CEDRUS_MAX_REF_IDX] qualifies), you can use a single for loop with
> > > > "u8 sram_array[CEDRUS_MAX_REF_IDX]".
> > > > This should make the code much more readable.
> > >
> > > This wasn't really about the alignment, but in order to get the
> > > offsets in the u32 and the array more easily.
> > >
> > > Breaking out the loop will make that computation less easy on the eye,
> > > so I guess it's very subjective.
> >
> > For some strange reason, the code below fixes a decoding issue with one of
> > my test samples. This is what I actually meant by the one-loop approach:
>
> Do you have that test sample somewhere accessible?

Yes, it's here:
http://jernej.libreelec.tv/videos/h264/Star%20Wars%20Episode%20VII%20-%20The%20Force%20Awakens%20-%20Teaser%20Trailer%202.mp4

It also needs prediction weight tables (your early patch for that should work
fine) and the scaling list (the code I sent you in one of my previous comments
should work). For me, once this sample worked without issues, every other
non-interlaced sample worked too.

> > static void _cedrus_write_ref_list(struct cedrus_ctx *ctx,
> > 				   struct cedrus_run *run,
> > 				   const u8 *ref_list, u8 num_ref,
> > 				   enum cedrus_h264_sram_off sram)
> > {
> > 	const struct v4l2_ctrl_h264_decode_param *decode = run->h264.decode_param;
> > 	struct vb2_queue *cap_q = &ctx->fh.m2m_ctx->cap_q_ctx.q;
> > 	struct cedrus_dev *dev = ctx->dev;
> > 	u8 sram_array[CEDRUS_MAX_REF_IDX];
> > 	unsigned int i;
> >
> > 	memset(sram_array, 0, sizeof(sram_array));
> > 	num_ref = min(num_ref, (u8)CEDRUS_MAX_REF_IDX);
> >
> > 	for (i = 0; i < num_ref; i++) {
> > 		const struct v4l2_h264_dpb_entry *dpb;
> > 		const struct cedrus_buffer *cedrus_buf;
> > 		const struct vb2_v4l2_buffer *ref_buf;
> > 		unsigned int position;
> > 		int buf_idx;
> > 		u8 dpb_idx;
> >
> > 		dpb_idx = ref_list[i];
> > 		dpb = &decode->dpb[dpb_idx];
> >
> > 		if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > 			continue;
> >
> > 		buf_idx = vb2_find_tag(cap_q, dpb->tag, 0);
> > 		if (buf_idx < 0)
> > 			continue;
> >
> > 		ref_buf = to_vb2_v4l2_buffer(ctx->dst_bufs[buf_idx]);
> > 		cedrus_buf = vb2_v4l2_to_cedrus_buffer(ref_buf);
> > 		position = cedrus_buf->codec.h264.position;
> >
> > 		sram_array[i] |= position << 1;
> > 		if (ref_buf->field == V4L2_FIELD_BOTTOM)
> > 			sram_array[i] |= BIT(0);
> > 	}
> >
> > 	cedrus_h264_write_sram(dev, sram, &sram_array, num_ref);
> > }
> >
> > IMO this code is easier to read.
>
> Indeed, thanks!
>
> > > > > +			const struct v4l2_h264_dpb_entry *dpb;
> > > > > +			const struct cedrus_buffer *cedrus_buf;
> > > > > +			const struct vb2_v4l2_buffer *ref_buf;
> > > > > +			unsigned int position;
> > > > > +			int buf_idx;
> > > > > +			u8 ref_idx = i + j;
> > > > > +			u8 dpb_idx;
> > > > > +
> > > > > +			if (ref_idx >= num_ref)
> > > > > +				break;
> > > > > +
> > > > > +			dpb_idx = ref_list[ref_idx];
> > > > > +			dpb = &decode->dpb[dpb_idx];
> > > > > +
> > > > > +			if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
> > > > > +				continue;
> > > > > +
> > > > > +			buf_idx = vb2_find_tag(cap_q, dpb->tag, 0);
> > > > > +			if (buf_idx < 0)
> > > > > +				continue;
> > > > > +
> > > > > +			ref_buf = to_vb2_v4l2_buffer(ctx->dst_bufs[buf_idx]);
> > > > > +			cedrus_buf = vb2_v4l2_to_cedrus_buffer(ref_buf);
> > > > > +			position = cedrus_buf->codec.h264.position;
> > > > > +
> > > > > +			sram_array[i] |= position << (j * 8 + 1);
> > > > > +			if (ref_buf->field == V4L2_FIELD_BOTTOM)
> > > >
> > > > You never set the above flag on the buffer, so this will always be
> > > > false.
> > >
> > > As far as I know, the field is supposed to be set by the userspace.
> >
> > How? I thought that only flags can be set when queueing buffers, and there
> > is no bottom/top flag.
>
> https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/buffer.html#c.v4l2_buffer
>
> "Indicates the field order of the image in the buffer, see
> v4l2_field. This field is not used when the buffer contains VBI
> data. Drivers must set it when type refers to a capture stream,
> applications when it refers to an output stream."
>
> My understanding is that the application should set it, since we'll
> use the output stream's buffer here. But I might very well be wrong
> about it :/

I'll take a look, thanks.

Best regards,
Jernej
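
For reference, here is a minimal userspace sketch of the two packing schemes discussed above. It is not the driver code: the sketch_* names are hypothetical, SKETCH_MAX_REF_IDX is an assumed value (the thread does not state CEDRUS_MAX_REF_IDX), and little-endian word layout is assumed. It only illustrates that a byte array filled in one loop can hold the same bytes as u32 words filled in two loops, provided the word index is i / 4. The quoted two-loop patch indexes sram_array with i, which advances in steps of four; that may be one reason the single-loop version behaves differently, although nothing in the thread confirms it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed value for illustration; the thread does not state CEDRUS_MAX_REF_IDX. */
#define SKETCH_MAX_REF_IDX 32

/* One table entry: buffer position in bits 7..1, bottom-field flag in bit 0. */
static uint8_t sketch_entry(uint8_t position, int bottom)
{
	return (uint8_t)((position << 1) | (bottom ? 1 : 0));
}

/* Single-loop packing into a byte array, as in the proposed rewrite. */
static void sketch_pack_bytes(const uint8_t *positions, const uint8_t *bottom,
			      unsigned int num_ref,
			      uint8_t out[SKETCH_MAX_REF_IDX])
{
	unsigned int i;

	assert(num_ref <= SKETCH_MAX_REF_IDX);
	memset(out, 0, SKETCH_MAX_REF_IDX);

	for (i = 0; i < num_ref; i++)
		out[i] = sketch_entry(positions[i], bottom[i]);
}

/*
 * Two-loop packing into 32-bit words, four entries per word. The word
 * index is i / 4 because i advances by four per outer iteration. The
 * words are then serialized as little-endian bytes for comparison.
 */
static void sketch_pack_words(const uint8_t *positions, const uint8_t *bottom,
			      unsigned int num_ref,
			      uint8_t out[SKETCH_MAX_REF_IDX])
{
	uint32_t words[SKETCH_MAX_REF_IDX / sizeof(uint32_t)];
	unsigned int i, j;

	assert(num_ref <= SKETCH_MAX_REF_IDX);
	memset(words, 0, sizeof(words));

	for (i = 0; i < num_ref; i += 4) {
		for (j = 0; j < 4 && i + j < num_ref; j++) {
			uint32_t e = sketch_entry(positions[i + j], bottom[i + j]);

			words[i / 4] |= e << (j * 8);
		}
	}

	for (i = 0; i < SKETCH_MAX_REF_IDX; i++)
		out[i] = (uint8_t)(words[i / 4] >> ((i % 4) * 8));
}
```

Calling both helpers with the same positions and flags should fill the two output buffers identically, which is why the single-loop form is a drop-in replacement as long as the hardware expects the entries in byte order.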