Re: [PATCH v3 2/4] media: rockchip: Introduce the rkvdec2 driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Alex,

On 2024-06-26 11:12, Alex Bee wrote:
> Hi Detlev,
> 
> Am 25.06.24 um 18:56 schrieb Detlev Casanova:
>> Hi Alex,
>>
>> On Sunday, June 23, 2024 5:33:28 A.M. EDT you wrote:
>>> Hi Detlev,
>>>
>>> Am 20.06.24 um 16:19 schrieb Detlev Casanova:
>>>> This driver supports the second generation of the Rockchip Video
>>>> decoder, also known as vdpu34x.
>>>> It is currently only used on the RK3588(s) SoC.
>>>>
>>>> There are 2 decoders on the RK3588 SoC that can work in pair to decode
>>>> 8K video at 30 FPS but currently, only using one core at a time is
>>>> supported.
>>>>
>>>> Scheduling requests between the two cores will be implemented later.
>>>>
>>>> The core supports H264, HEVC, VP9 and AVS2 decoding but this driver
>>>> currently only supports H264.
>>>>
>>>> The driver is based on rkvdec and they may share some code in the
>>>> future.
>>>> The decision to make a different driver is mainly because rkvdec2 has
>>>> more features and can work with multiple cores.
>>>>
>>>> The registers are mapped in a struct in RAM using bitfields. It is IO
>>>> copied to the HW when all values are configured.
>>>> The decision to use such a struct instead of writing buffers one by one
>>>>
>>>> is based on the following reasons:
>>>>    - Rockchip cores are known to misbehave when registers are not written
>>>>    
>>>>      in address order,
>>>>    
>>>>    - Those cores also need the software to write all registers, even if
>>>>    
>>>>      they are written their default values or are not related to the task
>>>>      (this core will not start decoding some H264 frames if some VP9
>>>>      registers are not written to 0)
>>>>    
>>>>    - In the future, to support multiple cores, the scheduler could be
>>>>    
>>>>      optimized by storing the precomputed registers values and copy them
>>>>      to the HW as soos as a core becomes available.
>>>>
>>>> This makes the code more readable and may bring performance improvements
>>>> in future features.
>>>>
>>>> Signed-off-by: Detlev Casanova <detlev.casanova@xxxxxxxxxxxxx>
>>>> ---
>>>>
>>>>    drivers/staging/media/Kconfig                |    1 +
>>>>    drivers/staging/media/Makefile               |    1 +
>>>>    drivers/staging/media/rkvdec2/Kconfig        |   15 +
>>>>    drivers/staging/media/rkvdec2/Makefile       |    3 +
>>>>    drivers/staging/media/rkvdec2/TODO           |    9 +
>>>>    drivers/staging/media/rkvdec2/rkvdec2-h264.c |  739 +++++++++++
>>>>    drivers/staging/media/rkvdec2/rkvdec2-regs.h |  345 +++++
>>>>    drivers/staging/media/rkvdec2/rkvdec2.c      | 1253 ++++++++++++++++++
>>>>    drivers/staging/media/rkvdec2/rkvdec2.h      |  130 ++
>>>>    9 files changed, 2496 insertions(+)
>>>>    create mode 100644 drivers/staging/media/rkvdec2/Kconfig
>>>>    create mode 100644 drivers/staging/media/rkvdec2/Makefile
>>>>    create mode 100644 drivers/staging/media/rkvdec2/TODO
>>>>    create mode 100644 drivers/staging/media/rkvdec2/rkvdec2-h264.c
>>>>    create mode 100644 drivers/staging/media/rkvdec2/rkvdec2-regs.h
>>>>    create mode 100644 drivers/staging/media/rkvdec2/rkvdec2.c
>>>>    create mode 100644 drivers/staging/media/rkvdec2/rkvdec2.h
>>> ...
>>>
>>>> +static inline void rkvdec2_memcpy_toio(void __iomem *dst, void *src,
>>>> size_t len) +{
>>>> +#ifdef CONFIG_ARM64
>>>> +	__iowrite32_copy(dst, src, len);
>>>> +#elif defined(CONFIG_ARM)
>>> I guess that can get an "#else" since memcpy_toio exists for all archs.
>>>
>>>> +	memcpy_toio(dst, src, len);
>>>> +#endif
>>>> +}
>>>> +
>>> ...
>>>
>>>> +	/* Set timeout threshold */
>>>> +	if (pixels < RKVDEC2_1080P_PIXELS)
>>>> +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_1080p;
>>>> +	else if (pixels < RKVDEC2_4K_PIXELS)
>>>> +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_4K;
>>>> +	else if (pixels < RKVDEC2_8K_PIXELS)
>>>> +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_8K;
>>>> +
>>> Did you test if it works with anything > 8K? If so, you propably want to
>>> make the check above
>>>
>>> +	else
>>> +		regs->common.timeout_threshold = RKVDEC2_TIMEOUT_8K;
>>>
>>> Otherwise the timeout may not be set/contain invalid values from any former
>>> stream.
>> That's right, but it would be set to 0 because of the memset.
>> RKVDEC2_TIMEOUT_8K might not be enough for bigger frame sizes, so I'll set it
>> to the maximum value (0xffffffff) when frames are bigger than 8K and also adapt
>> the watchdog time: RKVDEC2_TIMEOUT_8K is around 100 ms, but 0xffffffff is arnoud
>> 5.3 seconds (reg032/axi_clock_freq)
>>
>> I'll do more tests with this as well.
>>
>>> ...
>>>
>>>> +
>>>> +static const struct rkvdec2_coded_fmt_desc rkvdec2_coded_fmts[] = {
>>>> +	{
>>>> +		.fourcc = V4L2_PIX_FMT_H264_SLICE,
>>>> +		.frmsize = {
>>>> +			.min_width = 16,
>>>> +			.max_width =  65520,
>>>> +			.step_width = 16,
>>>> +			.min_height = 16,
>>>> +			.max_height =  65520,
>>>> +			.step_height = 16,
>>>> +		},
>>>> +		.ctrls = &rkvdec2_h264_ctrls,
>>>> +		.ops = &rkvdec2_h264_fmt_ops,
>>>> +		.num_decoded_fmts =
>> ARRAY_SIZE(rkvdec2_h264_decoded_fmts),
>>>> +		.decoded_fmts = rkvdec2_h264_decoded_fmts,
>>>> +		.subsystem_flags =
>> VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF,
>>>> +	},
>>>> +};
>>>> +
>>> Note, that this is also given to userspace (VIDIOC_ENUM_FRAMESIZES) and
>>> this is already incorrect in the old rkvdec driver (and hantro): From
>>> userspace perspective we do not have a restriction in
>>> step_width/step_width, as we are aligning any given width/height to HW
>>> requirements in the driver - what we should give to userspace is
>>> fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS; fsize->stepwise.min_height =
>>> 1; fsize->stepwise.min_width = 1; fsize->stepwise.max_height = 65520;
>>> fsize->stepwise.max_width = 65520;
>> Is fsize->stepwise.min_height = 1; and fsize->stepwise.min_width = 1 correct ?
>> Or do you mean fsize->stepwise.step_height = 1; and fsize->stepwise.setp_width
>> = 1 ?
>>
>> It would give this instead:
>>
>> .frmsize = {
>> 	.min_width = 16,
>> 	.max_width =  65520,
>> 	.step_width = 1,
>> 	.min_height = 16,
>> 	.max_height =  65520,
>> 	.step_height = 1,
>> },
>>
>> and .vidioc_enum_framesizes sets fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS;
> You can't adapt this here, because this .frmsize is also given to the
> v4l2_apply_frmsize_constraints helper, which does the actual alignment to
> HW requirements and requires the HW step_with and step_height.
> IIRC, we also align framesizes which are below minimum HW requirement, at
> least in rkvdec1 driver and it looks a lot like this is done here the same:
> so this should be .min_height = 1 and .min_width = 1. (I remember because
> there are VP9 conformance tests with very small framesizes). And yes, it
> looks like you've had to set .step_width and .step_height to 1 for
> V4L2_FRMSIZE_TYPE_CONTINUOUS, not sure why that is required.
> 
> So, imho, the final rkvdec2_enum_framesizes should look like
> 
> +static int rkvdec2_enum_framesizes(struct file *file, void *priv,
> +                   struct v4l2_frmsizeenum *fsize)
> ....
> +    fmt = rkvdec2_find_coded_fmt_desc(fsize->pixel_format);
> +    if (!fmt)
> +        return -EINVAL;
> +
> +    fsize->type = V4L2_FRMSIZE_TYPE_CONTINUOUS;
> +    fsize->stepwise.min_height = 1;
> +    fsize->stepwise.max_height = fmt->frmsize.max_height;
> +    fsize->stepwise.min_width = 1;
> +    fsize->stepwise.max_width = fmt->frmsize.max_width;
> +    fsize->stepwise.min_width = 1;
> +    fsize->stepwise.step_height = 1;
> +    fsize->stepwise.step_width = 1;
> +    return 0;
> +}
> 
> Note: Not even build tested :)
> Jonas: maybe you can add a fixup patch to your rkvdec patches as well.

Thanks, will take a closer look and include something for rkvdec high10
v6 later today/tomorrow.

Regards,
Jonas

> 
> Regards,
> 
> Alex
> 
>>> I guess this new driver should be an
>>> opportunity to fix that and distinguish between internal and external
>>> frame size requirements and the .vidioc_enum_framesizes callback should
>>> adapted accordingly. Regards, Alex
>> Detlev.





[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux