Re: [PATCH v2 4/4] arm64: dts: rockchip: Add rkvdec2 Video Decoder on rk3588(s)

Hi Detlev,

On 2024-06-27 22:56, Detlev Casanova wrote:
> Hi Jonas,
> 
> On Monday, June 24, 2024 5:16:33 A.M. EDT Jonas Karlman wrote:
>> Hi Detlev and Alex,
>>
>> On 2024-06-20 15:31, Detlev Casanova wrote:
>>> Hi Jonas, Alex,
>>>
>>> On Wednesday, June 19, 2024 2:06:40 P.M. EDT Jonas Karlman wrote:
>>>> Hi Alex,
>>>>
>>>> On 2024-06-19 19:19, Alex Bee wrote:
>>>>> Am 19.06.24 um 17:28 schrieb Jonas Karlman:
>>>>>> Hi Detlev,
>>>>>>
>>>>>> On 2024-06-19 16:57, Detlev Casanova wrote:
>>>>>>> Add the rkvdec2 Video Decoder to the RK3588s devicetree.
>>>>>>>
>>>>>>> Signed-off-by: Detlev Casanova <detlev.casanova@xxxxxxxxxxxxx>
>>>>>>> ---
>>>>>>>
>>>>>>>   arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 50 +++++++++++++++++++++++
>>>>>>>   1 file changed, 50 insertions(+)
>>>>>>>
>>>>>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>>>>>>> index 6ac5ac8b48ab..7690632f57f1 100644
>>>>>>> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>>>>>>> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>>>>>>> @@ -2596,6 +2596,16 @@ system_sram2: sram@ff001000 {
>>>>>>>
>>>>>>>   		ranges = <0x0 0x0 0xff001000 0xef000>;
>>>>>>>   		#address-cells = <1>;
>>>>>>>   		#size-cells = <1>;
>>>>>>>
>>>>>>> +
>>>>>>> +		vdec0_sram: rkvdec-sram@0 {
>>>>>>> +			reg = <0x0 0x78000>;
>>>>>>> +			pool;
>>>>>>> +		};
>>>>>>> +
>>>>>>> +		vdec1_sram: rkvdec-sram@1 {
>>>>>>> +			reg = <0x78000 0x77000>;
>>>>>>> +			pool;
>>>>>>> +		};
>>>>>>>
>>>>>>>   	};
>>>>>>>   	
>>>>>>>   	pinctrl: pinctrl {
>>>>>>>
>>>>>>> @@ -2665,6 +2675,46 @@ gpio4: gpio@fec50000 {
>>>>>>>
>>>>>>>   			#interrupt-cells = <2>;
>>>>>>>   		
>>>>>>>   		};
>>>>>>>   	
>>>>>>>   	};
>>>>>>>
>>>>>>> +
>>>>>>> +	vdec0: video-decoder@fdc38100 {
>>>>>>> +		compatible = "rockchip,rk3588-vdec";
>>>>>>> +		reg = <0x0 0xfdc38100 0x0 0x500>;
>>>>>>> +		interrupts = <GIC_SPI 95 IRQ_TYPE_LEVEL_HIGH 0>;
>>>>>>> +		clocks = <&cru ACLK_RKVDEC0>, <&cru HCLK_RKVDEC0>, <&cru CLK_RKVDEC0_CA>,
>>>>>>> +			 <&cru CLK_RKVDEC0_CORE>, <&cru CLK_RKVDEC0_HEVC_CA>;
>>>>>>> +		clock-names = "axi", "ahb", "cabac", "core", "hevc_cabac";
>>>>>>> +		assigned-clocks = <&cru ACLK_RKVDEC0>, <&cru CLK_RKVDEC0_CORE>,
>>>>>>> +				  <&cru CLK_RKVDEC0_CA>, <&cru CLK_RKVDEC0_HEVC_CA>;
>>>>>>> +		assigned-clock-rates = <800000000>, <600000000>,
>>>>>>> +				       <600000000>, <1000000000>;
>>>>>>> +		resets = <&cru SRST_A_RKVDEC0>, <&cru SRST_H_RKVDEC0>, <&cru SRST_RKVDEC0_CA>,
>>>>>>> +			 <&cru SRST_RKVDEC0_CORE>, <&cru SRST_RKVDEC0_HEVC_CA>;
>>>>>>> +		reset-names = "rst_axi", "rst_ahb", "rst_cabac",
>>>>>>> +			      "rst_core", "rst_hevc_cabac";
>>>>>>> +		power-domains = <&power RK3588_PD_RKVDEC0>;
>>>>>>> +		sram = <&vdec0_sram>;
>>>>>>> +		status = "okay";
>>>>>>> +	};
>>>>>>> +
>>>>>>> +	vdec1: video-decoder@fdc40100 {
>>>>>>> +		compatible = "rockchip,rk3588-vdec";
>>>>>>> +		reg = <0x0 0xfdc40100 0x0 0x500>;
>>>>>>> +		interrupts = <GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH 0>;
>>>>>>> +		clocks = <&cru ACLK_RKVDEC1>, <&cru HCLK_RKVDEC1>, <&cru CLK_RKVDEC1_CA>,
>>>>>>> +			 <&cru CLK_RKVDEC1_CORE>, <&cru CLK_RKVDEC1_HEVC_CA>;
>>>>>>> +		clock-names = "axi", "ahb", "cabac", "core", "hevc_cabac";
>>>>>>> +		assigned-clocks = <&cru ACLK_RKVDEC1>, <&cru CLK_RKVDEC1_CORE>,
>>>>>>> +				  <&cru CLK_RKVDEC1_CA>, <&cru CLK_RKVDEC1_HEVC_CA>;
>>>>>>> +		assigned-clock-rates = <800000000>, <600000000>,
>>>>>>> +				       <600000000>, <1000000000>;
>>>>>>> +		resets = <&cru SRST_A_RKVDEC1>, <&cru SRST_H_RKVDEC1>, <&cru SRST_RKVDEC1_CA>,
>>>>>>> +			 <&cru SRST_RKVDEC1_CORE>, <&cru SRST_RKVDEC1_HEVC_CA>;
>>>>>>> +		reset-names = "rst_axi", "rst_ahb", "rst_cabac",
>>>>>>> +			      "rst_core", "rst_hevc_cabac";
>>>>>>> +		power-domains = <&power RK3588_PD_RKVDEC1>;
>>>>>>> +		sram = <&vdec1_sram>;
>>>>>>> +		status = "okay";
>>>>>>> +	};
>>>>>>
>>>>>> This is still missing the iommus, please add the iommus, they should be
>>>>>>
>>>>>> supported/same as the one used for e.g. VOP2:
>>>>>>    compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
>>>>>>
>>>>>> The VOP2 MMUs do have one extra mmu_cfg_mode flag in AUTO_GATING,
>>>>>> compared to the VDPU381 MMUs, however only the AV1D MMU should be
>>>>>> special on RK3588.
>>>>>>
>>>>>> Please add the iommus :-)
>>>>>
>>>>> When looking at the vendor DT/iommu driver I'm seeing several quirks
>>>>> applied for vdec's iommus. Since it's rightly frowned upon to add such
>>>>> boolean-quirk-properties to upstream devicetrees, we'd at least need
>>>>> additional (fallback-) compatibles, even if it works with the iommu
>>>>> driver as is (which I doubt, but haven't tested). We need to be able to
>>>>> apply those quirks later without changing the devicetree (as usual) and
>>>>> I'm sure RK devs haven't added these quirks for their personal amusement.
>>>>
>>>> Based on what I investigated the hw should work similar, and the quirks
>>>> mostly seem related to optimizations and sw quirks, like do not zap each
>>>> line, keep it alive even when pm runtime say it is not in use and other
>>>> quirks that seem to be more of sw nature on how to best utilize the hw.
>>>
>>> I did some testing with the IOMMU but unfortunately, I'm only getting page
>>> fault errors. This may be something I'm doing wrong, but it clearly needs
>>> more investigation.
>>
>> I re-tested and the addition of sram now seems to cause page faults; the
>> sram also needs to be mapped in the iommu.
>>
>> However, more testing revealed that use of the iommu presents the same
>> issue as seen with hevc on rk3399: after a failure, fluster tests continue
>> to fail until a reset.
>>
>> Seeing how this issue was very similar I re-tested on rk3399 without
>> iommu and cma=1G and could observe that there was no longer any need to
>> reset after a failed test. Interestingly the score also went up from
>> 135 to 137/147.
>>
>> Digging some more revealed that the iommu also is reset during the
>> internal rkvdec soft reset on error, leaving the iommu with dte_addr=0
>> and paging in disabled state.
>>
>> Ensuring that the iommu was reconfigured after a failure fixed the issue
>> observed on rk3399 and I now also get 137/147 hevc fluster score using
>> the iommu.
>>
>> Will send out a rkvdec hevc v2 series after some more testing.
>>
>> Guessing there is a similar need to reconfigure the iommu on rk3588, and my
>> initial tests also showed promising results, however more tests are
>> needed.
> 
> I did some testing with the IOMMU. The good news is that it now works with the 
> SRAM.

Great. I did not look into SRAM at all, I just replaced the sram prop with
iommus for my tests, so it is good that you found a way to make it work with
the iommu :-)

> I am also able to hack the iommu driver to force a reset in case of an error 
> in the decoder. I'm not sure how to implement that with the IOMMU kernel API 
> though.

I am planning on sending something along the lines of this as an RFC:

https://github.com/Kwiboo/linux-rockchip/compare/6da640232631...bf332524d880

Re-configuring and re-enabling the iommu just before the next decoding run,
after a decode has failed, seems to resolve every issue I have seen. This has
mainly been tested with rkvdec and HEVC on RK3399/RK3328. On RK3588 this
also seemed to work, at least when I tested earlier this week.
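
As a rough user-space illustration of that idea (this is not the code in the
branch linked above, and every struct and function name here is made up for
the sketch), the decoder can remember that its last run ended in an error and
re-program the MMU registers before the next run. It assumes, as described
earlier in the thread, that the soft reset leaves dte_addr=0 and paging
disabled while the page tables in memory stay intact:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the MMU registers cleared by the decoder soft reset. */
struct mmu_regs {
	uint32_t dte_addr;       /* page-directory base, 0 after a soft reset */
	bool paging_enabled;     /* translation on/off */
};

/* Hypothetical decoder state; the page tables themselves are not modelled. */
struct vdec {
	struct mmu_regs mmu;
	uint32_t saved_dte_addr; /* software copy of the directory address */
	bool mmu_needs_restore;  /* set when the last run ended in an error */
};

/* Models the internal soft reset on a decode error. */
static void vdec_soft_reset(struct vdec *v)
{
	v->mmu.dte_addr = 0;
	v->mmu.paging_enabled = false;
	v->mmu_needs_restore = true;
}

/* Re-programs the registers only; existing mappings are left untouched. */
static void mmu_restore(struct vdec *v)
{
	v->mmu.dte_addr = v->saved_dte_addr;
	v->mmu.paging_enabled = true;
	v->mmu_needs_restore = false;
}

/* Returns true if the frame decoded; `corrupt` simulates a bad bitstream. */
static bool vdec_run(struct vdec *v, bool corrupt)
{
	if (v->mmu_needs_restore)
		mmu_restore(v);

	if (!v->mmu.paging_enabled)
		return false;    /* no translation: every access would fault */

	if (corrupt) {
		vdec_soft_reset(v);
		return false;
	}
	return true;
}
```

In this model a good frame that follows a bad one decodes again because
vdec_run() restores the registers first. Without the mmu_needs_restore check,
every later run would fail, which matches the "all tests fail after the first
failure" pattern.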

> 
> Another issue is that resetting the iommu will drop all buffer addresses of 
> other decoding contexts that may be running in parallel.

I do not think we need to, or should, reset the iommu. We just need to deal
with the fact that the rkvdec will reset and disable use of the mmu when it
resets itself.

> 
> I *think* that the downstream mpp remaps the buffers in the iommu for each 
> frame, but I'm not sure about that either.

As long as a frame can be decoded correctly, the mmu config seems to
continue to be valid and the next frame can be decoded.

> 
> So running fluster with `-j 1` gives me the expected 129/135 passed tests, but 
> `-j 8` will start failing all tests after the first fail (well, first fail 
> because of decoder error).

This was the main issue blocking rkvdec hevc. Just re-configuring the mmu
after a frame fails to decode seems to resolve this issue.

The biggest issue at the moment is how to properly signal the iommu
subsystem that it should re-configure. I may have abused the flush_iotlb_all
op, since that seemed to be the closest existing hook.
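
To make the "abused hook" concrete, here is a small user-space model of the
pattern. It uses made-up demo_* types rather than the kernel's real
struct iommu_domain or its ops table, so treat it as an assumption-laden
sketch of the shape of the workaround, not of the actual driver change. The
only entry point the decoder side can reach is a generic "flush everything"
callback, so the MMU side overloads that callback to also re-program itself
when it finds its registers cleared:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IOMMU domain; not the kernel structure. */
struct demo_domain {
	unsigned long dte_addr;  /* register value as the MMU currently sees it */
	unsigned long saved_dte; /* value programmed when the device attached */
	bool paging;             /* translation on/off */
	int tlb_flushes;         /* counts how often the hook really flushed */
};

/* Hypothetical ops table exposing a single "flush everything" hook. */
struct demo_domain_ops {
	void (*flush_all)(struct demo_domain *d);
};

/* Hook overloaded to re-program the MMU if an external reset cleared it. */
static void demo_flush_all(struct demo_domain *d)
{
	if (d->dte_addr != d->saved_dte || !d->paging) {
		d->dte_addr = d->saved_dte;
		d->paging = true;
	}
	d->tlb_flushes++;
}

static const struct demo_domain_ops demo_ops = {
	.flush_all = demo_flush_all,
};

/* What the decoder side would call after an error, via the generic hook. */
static void demo_notify_external_reset(struct demo_domain *d,
				       const struct demo_domain_ops *ops)
{
	if (ops && ops->flush_all)
		ops->flush_all(d);
}
```

The drawback is visible in the model: the caller asks for a flush but relies
on an undocumented side effect, which is why a dedicated way to signal an
external reset would be cleaner.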

Will send an RFC to linux-iommu to collect input on how to best signal
the iommu subsystem that the mmu has been reset by an external event and now
needs to be re-configured.

Regards,
Jonas

> 
>> Regards,
>> Jonas
>>
>>>>> If Detlev says
>>>>> iommu is out of scope for this series (which is valid), I'd say it's fine
>>>>> to leave them out for now (as no binding exists) and the HW works
>>>>> (obviously) fine without them.
>>>>
>>>> Sure, use of MMU can be added later.
>>>
>>> I'd rather go for that for now. I'll add that IOMMU support is missing in
>>> the TODO file.
>>>
>>>> Regards,
>>>> Jonas
>>>>
>>>>>> Regards,
>>>>>> Jonas
>>>>>>
>>>>>>>   };
>>>>>>>   
>>>>>>>   #include "rk3588s-pinctrl.dtsi"
> 




