Media Summit: Codecs October 29, 2019 - Lyon, France Many thanks to the Linux Foundation for hosting this meeting. Much appreciated! Please reply to this report with any comments/corrections. Especially if I missed any action items! Original announcement: https://lore.kernel.org/linux-media/c8380b43-2742-f1cb-0fb9-2c3c90e29a33@xxxxxxxxx/T/ Attendees: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> Alexandre Courbot <acourbot@xxxxxxxxxxxx>, Google Chrome OS Nicolas Dufresne <nicolas@xxxxxxxxxxxx> Tomasz Figa <tfiga@xxxxxxxxxxxx>, Google Chrome OS Ezequiel Garcia <ezequiel.garcia@xxxxxxxxxxxxx> Daniel Gomez <daniel@xxxxxxxx> Peter Griffin <Peter.griffin@xxxxxxxxxx> Dafna Hirschfeld <dafna.hirschfeld@xxxxxxxxxxxxx> Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx>, Bootlin Helen Koike <helen.koike@xxxxxxxxxxxxx> Jan Schmidt <jan@xxxxxxxxxxxxxxx> Dave Stevenson <dave.stevenson@xxxxxxxxxxxxxxx> Michael Tretter <m.tretter@xxxxxxxxxxxxxx> Stanimir Varbanov <stanimir.varbanov@xxxxxxxxxx> Hans Verkuil <hverkuil@xxxxxxxxx>, Cisco Systems Norway Matthew Waters <matthew@xxxxxxxxxxxxxxx> Discussion of pending codec patches ----------------------------------- A v3 series for v5.4 hantro fixes will be posted soon. Requirements for moving drivers out of staging ---------------------------------------------- - More testing is needed for these two H.264 and HEVC features: Multiview (used by 3D video, hantro supports this) Stream with sublayers - Improve standard references in the control metadata documentation. - Look into the ability to add fields to a compound control in the future: requires some investigation into the control framework. Helps to make the API more future proof. - Document that metadata controls must stick to the standard as much as possible. No hardware specific data is allowed. - At least one stateless encoder should be present, ideally for each codec. This is needed to see if the existing metadata controls can be reused by encoders. Check the Intel encoder and VA API. Also check if there is any source code for the AMD encoder. For reference: this is the Rockchip H.264 stateless encoder metadata: struct rk3288_h264e_reg_params { u32 frame_coding_type; s32 pic_init_qp; s32 slice_alpha_offset; s32 slice_beta_offset; s32 chroma_qp_index_offset; s32 filter_disable; u16 idr_pic_id; s32 pps_id; s32 frame_num; s32 slice_size_mb_rows; s32 h264_inter4x4_disabled; s32 enable_cabac; s32 transform8x8_mode; s32 cabac_init_idc; /* rate control relevant */ s32 qp; s32 mad_qp_delta; s32 mad_threshold; s32 qp_min; s32 qp_max; s32 cp_distance_mbs; s32 cp_target[10]; s32 target_error[7]; s32 delta_qp[7]; }; And this is the VP8 metadata: struct rk3399_vp8e_reg_params { u32 is_intra; u32 frm_hdr_size; u32 qp; s32 mv_prob[2][19]; s32 intra_prob; u32 bool_enc_value; u32 bool_enc_value_bits; u32 bool_enc_range; u32 filterDisable; u32 filter_sharpness; u32 filter_level; s32 intra_frm_delta; s32 last_frm_delta; s32 golden_frm_delta; s32 altref_frm_delta; s32 bpred_mode_delta; s32 zero_mode_delta; s32 newmv_mode_delta; s32 splitmv_mode_delta; }; Source: https://chromium.googlesource.com/chromiumos/third_party/libv4lplugins/+/79286ece8624ab016575a5ad8965a61b334ab169/libv4l-rockchip_v2/libvepu/common/rk_venc.h And the metadata for cedrus: struct h264enc_params { unsigned int width; unsigned int height; unsigned int src_width; unsigned int src_height; enum color_format { H264_FMT_NV12 = 0, H264_FMT_NV16 = 1 } src_format; unsigned int profile_idc, level_idc; enum { H264_EC_CAVLC = 0, H264_EC_CABAC = 1 } entropy_coding_mode; unsigned int qp; unsigned int keyframe_interval; }; Source: https://github.com/jemk/cedrus/blob/master/h264enc/h264enc.c Cisco also has a requirement that the bitrate can be controlled per-frame. Conclusion: stateless encoder support needs some research. However, the general suspicion is that the decoder metadata controls are unlikely to be reused for stateless encoders. Finalize Stateful Encoder ------------------------- Currently S_PARM/ENUM_FRAMEINTERVALS is used to set the framerate which is needed by encoders together with the desired bitrate to determine the compression ratio. After some discussion we realized that this should actually refer to the rate at which the encoder produces compressed frames: this is needed when you want to encoder multiple streams in parallel and you want to indicate how the encoder hardware should balance these encoder processes. E.g. the Xilinx encoder can reserve N encoding cores depending on the demand. Setting the actual framerate (which is needed to determine the compression ratio) is separate from this and should be done through a new control of type v4l2_fract that indicates the framerate (not interval, the userspace people very much preferred framerate over frameinterval). This requires the introduction of V4L2_CTRL_WHICH_{MIN,MAX}_VAL to obtain the min and max values of a compound control. But this tries in nicely with the work Ricardo Ribalda Delgado is doing for the V4L2_CTRL_TYPE_AREA controls. Hans will implement this in the control framework, Michael Tretter will add support for this in his Xilinx driver. The other outstanding issue with stateful encoders is what to do if the capture buffer is too small. It turns out that it is next to impossible to precisely predict the minimum size of a capture buffers, so some mechanism is needed to handle this corner case. We agreed that the best way is to mark the capture buffer that's too small with V4L2_BUF_FLAG_ERROR and indicate that the reason is that it is too small with a new buffer flag: V4L2_BUF_FLAG_TOO_SMALL (0x00080000). When userspace sees this it should stop streaming on the capture side, reallocate and requeue capture buffers, and restart streaming. This will work fine if there are no B frames. If the encoder produces B frames as well, then this approach can produce an invalid stream. The only way this can be resolved is if the HW/FW can rollback its internal state to before the point this error was detected. In the future we need a pixelformat flag to indicate that the HW/FW can rollback. If the HW can fragment the encoded frames over multiple capture buffers, then it should do so. The driver should set V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM for this. However, this feature should probably be expliticly requested. This can be done through a new V4L2_PIX_FMT_FLAG_ flag. Some more discussion is needed for this. Nicolas mentioned that some codec drivers used to return wrong values for ENUM_FRAMEINTERVALS (swapped numerator/denominator): v4l2-compliance should check if the returned values are sane. Nicolas also mentioned that it is not clear how drivers should round S_FMT resolutions for codecs: this is currently driver specific. He would like this to be documented (and checked) as rounding up. Touched upon but not really discussed in-depth: SVC (Scalable Video Coding) support. Action Items ------------ Hans Verkuil: - Ask Cisco colleagues which bitrate-related parameters have to be per-frame for an encoder - make stateful encoder infrastructure + documentation for the missing bits - investigate using different sizes for metadata controls in the control framework: is this possible? Michael Tretter: - Support the new encoder stateful controls in the driver Tomasz Figa: - look up AMD encoder support Boris Brezillon: - send v3 of hantro g1 fixes Nicolas Dufresne: - look into multiview and sublayers support Paul Kocialkowski: - check metadata controls against the standards and update the docs if needed Ezequiel Garcia and Boris Brezillon: - add VP9 support