Re: [PATCH v3 2/4] media: rockchip: Introduce the rkvdec2 driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Alex,

On Sunday, June 23, 2024 5:33:28 A.M. EDT you wrote:
> Hi Detlev,
> 
> Am 20.06.24 um 16:19 schrieb Detlev Casanova:
> > This driver supports the second generation of the Rockchip Video
> > decoder, also known as vdpu34x.
> > It is currently only used on the RK3588(s) SoC.
> > 
> > There are 2 decoders on the RK3588 SoC that can work in pair to decode
> > 8K video at 30 FPS but currently, only using one core at a time is
> > supported.
> > 
> > Scheduling requests between the two cores will be implemented later.
> > 
> > The core supports H264, HEVC, VP9 and AVS2 decoding but this driver
> > currently only supports H264.
> > 
> > The driver is based on rkvdec and they may share some code in the
> > future.
> > The decision to make a different driver is mainly because rkvdec2 has
> > more features and can work with multiple cores.
> > 
> > The registers are mapped in a struct in RAM using bitfields. It is IO
> > copied to the HW when all values are configured.
> > The decision to use such a struct instead of writing buffers one by one
> > 
> > is based on the following reasons:
> >   - Rockchip cores are known to misbehave when registers are not written
> >   
> >     in address order,
> >   
> >   - Those cores also need the software to write all registers, even if
> >   
> >     they are written their default values or are not related to the task
> >     (this core will not start decoding some H264 frames if some VP9
> >     registers are not written to 0)
> >   
> >   - In the future, to support multiple cores, the scheduler could be
> >   
> >     optimized by storing the precomputed registers values and copy them
> >     to the HW as soos as a core becomes available.
> > 
> > This makes the code more readable and may bring performance improvements
> > in future features.
> > 
> > Signed-off-by: Detlev Casanova <detlev.casanova@xxxxxxxxxxxxx>
> > ---
> > 
> >   drivers/staging/media/Kconfig                |    1 +
> >   drivers/staging/media/Makefile               |    1 +
> >   drivers/staging/media/rkvdec2/Kconfig        |   15 +
> >   drivers/staging/media/rkvdec2/Makefile       |    3 +
> >   drivers/staging/media/rkvdec2/TODO           |    9 +
> >   drivers/staging/media/rkvdec2/rkvdec2-h264.c |  739 +++++++++++
> >   drivers/staging/media/rkvdec2/rkvdec2-regs.h |  345 +++++
> >   drivers/staging/media/rkvdec2/rkvdec2.c      | 1253 ++++++++++++++++++
> >   drivers/staging/media/rkvdec2/rkvdec2.h      |  130 ++
> >   9 files changed, 2496 insertions(+)
> >   create mode 100644 drivers/staging/media/rkvdec2/Kconfig
> >   create mode 100644 drivers/staging/media/rkvdec2/Makefile
> >   create mode 100644 drivers/staging/media/rkvdec2/TODO
> >   create mode 100644 drivers/staging/media/rkvdec2/rkvdec2-h264.c
> >   create mode 100644 drivers/staging/media/rkvdec2/rkvdec2-regs.h
> >   create mode 100644 drivers/staging/media/rkvdec2/rkvdec2.c
> >   create mode 100644 drivers/staging/media/rkvdec2/rkvdec2.h
> 
> ...
> 
> > +static inline void rkvdec2_memcpy_toio(void __iomem *dst, void *src,
> > size_t len) +{
> > +#ifdef CONFIG_ARM64
> > +	__iowrite32_copy(dst, src, len);
> > +#elif defined(CONFIG_ARM)
> 
> I guess that can get an "#else" since memcpy_toio exists for all archs.
> 
> > +	memcpy_toio(dst, src, len);
> > +#endif
> > +}
> > +
> 
> ...
> 
> > +	/* Set timeout threshold */
> > +	if (pixels < RKVDEC2_1080P_PIXELS)
> > +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_1080p;
> > +	else if (pixels < RKVDEC2_4K_PIXELS)
> > +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_4K;
> > +	else if (pixels < RKVDEC2_8K_PIXELS)
> > +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_8K;
> > +
> 
> Did you test if it works with anything > 8K? If so, you propably want to
> make the check above
> 
> +	else
> +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_8K;
> 
> Otherwise the timeout may not be set/contain invalid values from any former
> stream.

That's right, but it would be set to 0 because of the memset. 
RKVDEC2_TIMEOUT_8K might not be enough for bigger frame sizes, so I'll set it 
to the maximum value (0xffffffff) when frames are bigger than 8K and also adapt 
the watchdog time: RKVDEC2_TIMEOUT_8K is around 100 ms, but 0xffffffff is arnoud 
5.3 seconds (reg032/axi_clock_freq)

I'll do more tests with this as well.

> ...
> 
> > +
> > +static const struct rkvdec2_coded_fmt_desc rkvdec2_coded_fmts[] = {
> > +	{
> > +		.fourcc = V4L2_PIX_FMT_H264_SLICE,
> > +		.frmsize = {
> > +			.min_width = 16,
> > +			.max_width =  65520,
> > +			.step_width = 16,
> > +			.min_height = 16,
> > +			.max_height =  65520,
> > +			.step_height = 16,
> > +		},
> > +		.ctrls = &rkvdec2_h264_ctrls,
> > +		.ops = &rkvdec2_h264_fmt_ops,
> > +		.num_decoded_fmts = 
ARRAY_SIZE(rkvdec2_h264_decoded_fmts),
> > +		.decoded_fmts = rkvdec2_h264_decoded_fmts,
> > +		.subsystem_flags = 
VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF,
> > +	},
> > +};
> > +
> 
> Note, that this is also given to userspace (VIDIOC_ENUM_FRAMESIZES) and
> this is already incorrect in the old rkvdec driver (and hantro): From
> userspace perspective we do not have a restriction in
> step_width/step_width, as we are aligning any given width/height to HW
> requirements in the driver - what we should give to userspace is
> fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS; fsize->stepwise.min_height =
> 1; fsize->stepwise.min_width = 1; fsize->stepwise.max_height = 65520;
> fsize->stepwise.max_width = 65520; 

Is fsize->stepwise.min_height = 1; and fsize->stepwise.min_width = 1 correct ?
Or do you mean fsize->stepwise.step_height = 1; and fsize->stepwise.setp_width 
= 1 ? 

It would give this instead:

.frmsize = {
	.min_width = 16,
	.max_width =  65520,
	.step_width = 1,
	.min_height = 16,
	.max_height =  65520,
	.step_height = 1,
},

and .vidioc_enum_framesizes sets fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS;

> I guess this new driver should be an
> opportunity to fix that and distinguish between internal and external
> frame size requirements and the .vidioc_enum_framesizes callback should
> adapted accordingly. Regards, Alex

Detlev.

Attachment: signature.asc
Description: This is a digitally signed message part.


[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux