Re: [RFC PATCH 1/2] drm/msm/dpu: add dsc helper functions

Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> · Sun, 26 Feb 2023 15:09:51 +0200

On 26/02/2023 02:47, Abhinav Kumar wrote:
Hi Dmitry

On 2/25/2023 7:23 AM, Dmitry Baryshkov wrote:
On 25/02/2023 02:36, Abhinav Kumar wrote:

On 2/24/2023 3:53 PM, Dmitry Baryshkov wrote:
On Sat, 25 Feb 2023 at 00:26, Abhinav Kumar 
<quic_abhinavk@xxxxxxxxxxx> wrote:
On 2/24/2023 1:36 PM, Dmitry Baryshkov wrote:
24 февраля 2023 г. 23:23:03 GMT+02:00, Abhinav Kumar 
<quic_abhinavk@xxxxxxxxxxx> пишет:
On 2/24/2023 1:13 PM, Dmitry Baryshkov wrote:
On 24/02/2023 21:40, Kuogee Hsieh wrote:
Add DSC helper functions based on DSC configuration profiles to 
produce
DSC related runtime parameters through both table look up and 
runtime
calculation to support DSC on DPU.

There are 6 different DSC configuration profiles are supported 
currently.
DSC configuration profiles are differiented by 5 keys, DSC 
version (V1.1),
chroma (444/422/420), colorspace (RGB/YUV), bpc(8/10),
bpp (6/7/7.5/8/9/10/12/15) and SCR (0/1).

Only DSC version V1.1 added and V1.2 will be added later.

These helpers should go to drivers/gpu/drm/display/drm_dsc_helper.c
Also please check that they can be used for i915 or for amdgpu 
(ideally for both of them).

No, it cannot. So each DSC encoder parameter is calculated based 
on the HW core which is being used.

They all get packed to the same DSC structure which is the struct 
drm_dsc_config but the way the parameters are computed is 
specific to the HW.

This DPU file helper still uses the drm_dsc_helper's 
drm_dsc_compute_rc_parameters() like all other vendors do but the 
parameters themselves are very HW specific and belong to each 
vendor's dir.

This is not unique to MSM.

Lets take a few other examples:

AMD: 
https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/amd/display/dc/dml/dsc/rc_calc_fpu.c#L165

i915: 
https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/i915/display/intel_vdsc.c#L379

I checked several values here. Intel driver defines more bpc/bpp 
combinations, but the ones which are defined in intel_vdsc and in 
this patch seem to match. If there are major differences there, 
please point me to the exact case.

I remember that AMD driver might have different values.

Some values in the rc_params table do match. But the 
rc_buf_thresh[] doesnt.

Because later they do:

vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;

https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/i915/display/intel_vdsc.c#L40

Vs

+static u16 dpu_dsc_rc_buf_thresh[DSC_NUM_BUF_RANGES - 1] = {
+               0x0e, 0x1c, 0x2a, 0x38, 0x46, 0x54,
+               0x62, 0x69, 0x70, 0x77, 0x79, 0x7b, 0x7d, 0x7e
+};

I'd prefer to have 896, 1792, etc. here, as those values come from the
standard. As it's done in the Intel driver.

Got it, thanks

I dont know the AMD calculation very well to say that moving this 
to the
helper is going to help.

Those calculations correspond (more or less) at the first glance to
what intel does for their newer generations. I think that's not our
problem for now.

Well, we have to figure out if each value matches and if each of them 
come from the spec for us and i915 and from which section. So it is 
unfortunately our problem.

Otherwise it will have to be handled by Marijn, me or anybody else 
wanting to hack up the DSC code. Or by anybody adding DSC support to 
the next platform and having to figure out the difference between 
i915, msm and their platform.

Yes, I wonder why the same doubt didn't arise when the other vendors 
added their support both from other maintainers and others.

Which makes me think that like I wrote in my previous response, these 
are "recommended" values in the spec but its not mandatory.

I think, it is because there were no other drivers to compare. In other 
words, for a first driver it is pretty logical to have everything 
handled on its own. As soon as we start getting other implementations of 
a feature, it becomes logical to think if the code can be generalized. 
This is what we see we with the HDCP series or with the code being moved 
to DP helpers.

Moving this to the drm_dsc_helper is generalizing the tables and not 
giving room for the vendors to customize even if they want to (which the 
spec does allow).

That depends on the API you select. For example, in 
intel_dsc_compute_params() I see customization being applied to 
rc_buf_thresh in 6bpp case. I'd leave that to the i915 driver.

In case the driver needs to perform customization of the params, nothing 
stops it drop applying after filling all the RC params in the 
drm_dsc_config struct via the generic helper.

So if this has any merit and if you or Marijn would like to take it up, 
go for it. We would do the same thing as either of you would have to in 
terms of figuring out the difference between msm and the i915 code.

This is not a generic API we are trying to put in a helper, these are 
hard-coded tables so there is a difference between looking at these Vs 
looking at some common code which can move to the core.

Also, i think its too risky to change other drivers to use whatever 
math
we put in the drm_dsc_helper to compute thr RC params because their 
code
might be computing and using this tables differently.

Its too much ownership for MSM developers to move this to 
drm_dsc_helper
and own that as it might cause breakage of basic DSC even if some 
values
are repeated.

It's time to stop thinking about ownership and start thinking about
shared code. We already have two instances of DSC tables. I don't
think having a third instance, which is a subset of an existing
dataset, would be beneficial to anybody.
AMD has complicated code which supports half-bit bpp and calculates
some of the parameters. But sharing data with the i915 driver is
straightforward.

Sorry, but I would like to get an ack from i915 folks if this is going
to be useful to them if we move this to helper because we have to 
look at every table. Not just one.

Added i915 maintainers to the CC list for them to be able to answer.

Thanks, lets wait to hear from them about where finally these tables 
should go but thats can be taken up as a separate effort too.

Also, this is just 1.1, we will add more tables for 1.2. So we will 
have to end up changing both 1.1 and 1.2 tables as they are different 
for QC.

I haven't heard back from Kuogee about the possible causes of using 
rc/qp values from 1.2 even for 1.1 panels. Maybe you can comment on 
that? In other words, can we always stick to the values from 1.2 
standard? What will be the drawback?

Otherwise, we'd have to have two different sets of values, like you do 
in your vendor driver.

I have responded to this in the other email.

All this being said, even if the rc tables move the drm_dsc_helper 
either now or later on, we will still need MSM specific calculations for 
many of the other encoder parameters (which are again either hard-coded 
or calculated). Please refer to the sde_dsc_populate_dsc_config() 
downstream. And yes, you will not find those in the DP spec directly.

So we will still need a dsc helper for MSM calculations to be common for 
DSI / DP irrespective of where the tables go.

So, lets finalize that first.

I went on and trimmed sde_dsc_populate_dsc_config() to remove 
duplication with the drm_dsc_compute_rc_parameters() (which we already 
use for the MSM DSI DSC).

Not much is left:

dsc->first_line_bpg_offset set via the switch

dsc->line_buf_depth = bpc + 1;
dsc->mux_word_size = bpc > 10 ? DSC_MUX_WORD_SIZE_12_BPC:
        DSC_MUX_WORD_SIZE_8_10_BPC;

if ((dsc->dsc_version_minor == 0x2) && (dsc->native_420))
    dsc->nsl_bpg_offset = (2048 *
             (DIV_ROUND_UP(dsc->second_line_bpg_offset,
                                (dsc->slice_height - 1))));

dsc->initial_scale_value = 8 * dsc->rc_model_size /
                        (dsc->rc_model_size - dsc->initial_offset);

mux_word_size comes from the standard (must)
initial_scale_value calculation is recommended, but not required
nsl_bpg_offset follows the standard (must), also see below (*).

first_line_bpg_offset calculation differs between three drivers. The 
standard also provides a recommended formulas. I think we can leave it 
as is for now.

I think, that mux_word_size and nsl_bpg_offset calculation should be 
moved to drm_dsc_compute_rc_parameters(), while leaving 
initial_scale_value in place (in the driver code).

* I think nsl_bpg_offset is slightly incorrectly calculated. Standard 
demands that it is set to 'second_line_bpg_offset / (slice_height - 1), 
rounded up to 16 fraction bits', while SDE driver code sets it to the 
value rounded up to the next integer (having 16 fraction bits 
representation).

In my opinion correct calculation should be:
dsc->nsl_bpg_offset = DIV_ROUND_UP(2048 * dsc->second_line_bpg_offset,
                                (dsc->slice_height - 1));

Could you please check, which one is correct according to the standard?

--
With best wishes
Dmitry