On Tue, 15 Feb 2022 21:27:30 +0100, you wrote: >Dne torek, 15. februar 2022 ob 17:31:28 CET je John Cox napisal(a): >> On Tue, 15 Feb 2022 17:11:12 +0100, you wrote: >> >> >Dne torek, 15. februar 2022 ob 17:00:33 CET je John Cox napisal(a): >> >> On Tue, 15 Feb 2022 10:28:55 -0500, you wrote: >> >> >> >> >Le mardi 15 février 2022 à 14:50 +0000, John Cox a écrit : >> >> >> On Tue, 15 Feb 2022 15:35:12 +0100, you wrote: >> >> >> >> >> >> > >> >> >> > Le 15/02/2022 à 15:17, John Cox a écrit : >> >> >> > > Hi >> >> >> > > >> >> >> > > > The bit size of the slice could be deduced from the buffer >payload >> >> >> > > > so remove bit_size field to avoid duplicated the information. >> >> >> > > I think this is a bad idea. In the future we are (I hope) going to >> >want >> >> >> > > to have an array (variable) of slice headers all referring to the >> >same >> >> >> > > bit buffer. When we do that we will need this field. >> >> >> > >> >> >> > I wonder if that could be considering like another decode mode and >so >> >> >> > use an other control ? >> >> >> >> >> >> I, personally, would be in favour of making the slice header control a >> >> >> variable array just as it is. If userland can't cope with multiple >> >> >> entries then just send them one at a time and the code looks exactly >> >> >> like it does at the moment and if the driver can't then set max array >> >> >> entries to 1. >> >> >> >> >> >> Having implemented this in rpi port of ffmpeg and the RPi V4L2 driver I >> >> >> can say with experience that the code and effort overhead is very low. >> >> >> >> >> >> Either way having a multiple slice header control in the UAPI is >> >> >> important for efficiency. >> >> > >> >> >Just to clarify the idea, we would have a single slice controls, always >> >dynamic: >> >> > >> >> >1. For sliced based decoder >> >> > >> >> >The dynamic array slice control is implemented by the driver and its >size >> >must >> >> >be 1. >> >> >> >> Yes >> >> >> >> >2. For frame based decoder that don't care for slices >> >> > >> >> >The dynamic array slice controls is not implement. Userland detects that >at >> >> >runtime, similar to the VP9 compressed headers. >> >> >> >> If the driver parses all the slice header then that seems plausible >> >> >> >> >3. For frame based decoders that needs slices (or driver that supports >> >offset >> >> >and can gain performance with such mode) >> >> > >> >> >The dynamic array slice controls is implemented, and should contain all >the >> >> >slices found in the OUTPUT buffer. >> >> > >> >> >So the reason for this bit_size (not sure why its bits though, perhaps >> >someone >> >> >can educate me ?) >> >> >> >> RPi doesn't need bits and would be happy with bytes however >> >> slice_segment data isn't byte aligned at the end so its possible that >> >> there might be decoders out there that want an accurate length for that. >> > >> >There are two fields, please don't mix them up: >> > >> >__u32 bit_size; >> >__u32 data_bit_offset; (changed to data_byte_offset in this series) >> > >> >data_bit_offset/data_byte_offset is useful, while bit_size is IMO not. If you >> >have multiple slices in array, you only need to know start of the slice >data >> >and that offset is always offset from start of the buffer (absolute, it's not >> >relative to previous slice data). >> >> No... or at least I think not. RPi needs the start and end of the >> slice_segment_data elements of each slice. > >It would be good to know if size needs to be exact or can overshoot, like >using end of buffer for that. > >Cedrus also wants to know slice data size, but it turns out that bigger than >necessary size doesn't pose any problems. If that's not the case, then >bit_size needs stay in for sure. RPi needs the exact size (bytes will do - but I don't see the harm in specifying it in bits in case some future h/w wants the extra precision as long as we nail down exactly which bit we mean) Regards JC >Best regards, >Jernej > >> If slices are arranged in the >> buffer with slice_segment_headers attached then I don't see how I get to >> know that. Also if the OUTPUT buffer is just a bit of bitstream, which >> might well be very convienient for some userspace, then it is legitimate >> to have SEIs between slice headers so you can't even guarantee that your >> coded slice segments are contiguous. >> >> Regards >> >> JC >> >> >Best regards, >> >Jernej >> > >> >> >> >> > Would be to let the driver offset inside the the single >> >> >OUTPUT/bitstream buffer in case this is not automatically found by the >> >driver >> >> >(or that no start-code is needed). Is that last bit correct ? If so, >should >> >we >> >> >change it to an offset rather then a size ? Shall we allow using offesets >> >inside >> >> >larger buffer (e.g. to avoid some memory copies) for the Sliced Base >cases ? >> >> >> >> I use (in the current structure) data_bit_offset to find the start of >> >> each slice's slice_segment_data within the OUTPUT buffer and bit_size to >> >> find the end. RPi doesn't / can't parse the slice_header and so wants >> >> all of that. Decoders that do parse the header might plausably want >> >> header offsets too and it would facilitate zero copy of the bit buffer. >> >> >> >> >> >> >> Regards >> >> >> >> >> >> John Cox >> >> >> >> >> >> > > > Signed-off-by: Benjamin Gaignard ><benjamin.gaignard@xxxxxxxxxxxxx> >> >> >> > > > --- >> >> >> > > > .../userspace-api/media/v4l/ext-ctrls-codec.rst | 3 --- >> >> >> > > > drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 11 +++ >> >+------- >> >> >> > > > include/uapi/linux/v4l2-controls.h | 3 +-- >> >> >> > > > 3 files changed, 5 insertions(+), 12 deletions(-) >> >> >> > > > >> >> >> > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls- >> >codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst >> >> >> > > > index 3296ac3b9fca..c3ae97657fa7 100644 >> >> >> > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst >> >> >> > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst >> >> >> > > > @@ -2965,9 +2965,6 @@ enum >v4l2_mpeg_video_hevc_size_of_length_field >> >- >> >> >> > > > :stub-columns: 0 >> >> >> > > > :widths: 1 1 2 >> >> >> > > > >> >> >> > > > - * - __u32 >> >> >> > > > - - ``bit_size`` >> >> >> > > > - - Size (in bits) of the current slice data. >> >> >> > > > * - __u32 >> >> >> > > > - ``data_bit_offset`` >> >> >> > > > - Offset (in bits) to the video data in the current slice >> >data. >> >> >> > > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/ >> >drivers/staging/media/sunxi/cedrus/cedrus_h265.c >> >> >> > > > index 8ab2d9c6f048..db8c7475eeb8 100644 >> >> >> > > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c >> >> >> > > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c >> >> >> > > > @@ -312,8 +312,8 @@ static void cedrus_h265_setup(struct >cedrus_ctx >> >*ctx, >> >> >> > > > const struct v4l2_hevc_pred_weight_table >*pred_weight_table; >> >> >> > > > unsigned int width_in_ctb_luma, ctb_size_luma; >> >> >> > > > unsigned int log2_max_luma_coding_block_size; >> >> >> > > > + size_t slice_bytes; >> >> >> > > > dma_addr_t src_buf_addr; >> >> >> > > > - dma_addr_t src_buf_end_addr; >> >> >> > > > u32 chroma_log2_weight_denom; >> >> >> > > > u32 output_pic_list_index; >> >> >> > > > u32 pic_order_cnt[2]; >> >> >> > > > @@ -370,8 +370,8 @@ static void cedrus_h265_setup(struct >cedrus_ctx >> >*ctx, >> >> >> > > > >> >> >> > > > cedrus_write(dev, VE_DEC_H265_BITS_OFFSET, 0); >> >> >> > > > >> >> >> > > > - reg = slice_params->bit_size; >> >> >> > > > - cedrus_write(dev, VE_DEC_H265_BITS_LEN, reg); >> >> >> > > > + slice_bytes = vb2_get_plane_payload(&run->src- >>vb2_buf, 0); >> >> >> > > > + cedrus_write(dev, VE_DEC_H265_BITS_LEN, slice_bytes); >> >> >> > > I think one of these must be wrong. bit_size is in bits, >> >> >> > > vb2_get_plane_payload is in bytes? >> >> >> > >> >> >> > You are right it should be vb2_get_plane_payload() * 8 to get the >size >> >in bits. >> >> >> > >> >> >> > I will change that in v3. >> >> >> > >> >> >> > > >> >> >> > > Regards >> >> >> > > >> >> >> > > John Cox >> >> >> > > >> >> >> > > > /* Source beginning and end addresses. */ >> >> >> > > > >> >> >> > > > @@ -384,10 +384,7 @@ static void cedrus_h265_setup(struct >> >cedrus_ctx *ctx, >> >> >> > > > >> >> >> > > > cedrus_write(dev, VE_DEC_H265_BITS_ADDR, reg); >> >> >> > > > >> >> >> > > > - src_buf_end_addr = src_buf_addr + >> >> >> > > > - DIV_ROUND_UP(slice_params- >>bit_size, >> >8); >> >> >> > > > - >> >> >> > > > - reg = >VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_end_addr); >> >> >> > > > + reg = VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_addr + >slice_bytes); >> >> >> > > > cedrus_write(dev, VE_DEC_H265_BITS_END_ADDR, reg); >> >> >> > > > >> >> >> > > > /* Coding tree block address */ >> >> >> > > > diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/ >> >linux/v4l2-controls.h >> >> >> > > > index b1a3dc05f02f..27f5d272dc43 100644 >> >> >> > > > --- a/include/uapi/linux/v4l2-controls.h >> >> >> > > > +++ b/include/uapi/linux/v4l2-controls.h >> >> >> > > > @@ -2457,7 +2457,6 @@ struct v4l2_hevc_pred_weight_table { >> >> >> > > > #define V4L2_HEVC_SLICE_PARAMS_FLAG_DEPENDENT_SLICE_SEGMENT >> >(1ULL << 9) >> >> >> > > > >> >> >> > > > struct v4l2_ctrl_hevc_slice_params { >> >> >> > > > - __u32 bit_size; >> >> >> > > > __u32 data_bit_offset; >> >> >> > > > >> >> >> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header >*/ >> >> >> > > > @@ -2484,7 +2483,7 @@ struct v4l2_ctrl_hevc_slice_params { >> >> >> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing >SEI message >> >*/ >> >> >> > > > __u8 pic_struct; >> >> >> > > > >> >> >> > > > - __u8 reserved; >> >> >> > > > + __u8 reserved[5]; >> >> >> > > > >> >> >> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice >segment >> >header */ >> >> >> > > > __u32 slice_segment_addr; >> >> >> > >> >