On Tue, 14 Sep 2021, "Kulkarni, Vandita" <vandita.kulkarni@xxxxxxxxx> wrote: >> -----Original Message----- >> From: Nikula, Jani <jani.nikula@xxxxxxxxx> >> Sent: Tuesday, September 14, 2021 7:33 PM >> To: Lisovskiy, Stanislav <stanislav.lisovskiy@xxxxxxxxx> >> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>; Kulkarni, Vandita >> <vandita.kulkarni@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Navare, >> Manasi D <manasi.d.navare@xxxxxxxxx> >> Subject: Re: [PATCH] drm/i915/display: Enable second VDSC >> engine for higher moderates >> >> On Tue, 14 Sep 2021, "Lisovskiy, Stanislav" <stanislav.lisovskiy@xxxxxxxxx> >> wrote: >> > On Tue, Sep 14, 2021 at 04:04:25PM +0300, Lisovskiy, Stanislav wrote: >> >> On Tue, Sep 14, 2021 at 03:04:11PM +0300, Jani Nikula wrote: >> >> > On Tue, 14 Sep 2021, "Lisovskiy, Stanislav" >> <stanislav.lisovskiy@xxxxxxxxx> wrote: >> >> > > On Tue, Sep 14, 2021 at 10:48:46AM +0300, Ville Syrjälä wrote: >> >> > >> On Tue, Sep 14, 2021 at 07:31:46AM +0000, Kulkarni, Vandita wrote: >> >> > >> > > -----Original Message----- >> >> > >> > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> >> >> > >> > > Sent: Tuesday, September 14, 2021 12:59 PM >> >> > >> > > To: Kulkarni, Vandita <vandita.kulkarni@xxxxxxxxx> >> >> > >> > > Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Nikula, Jani >> >> > >> > > <jani.nikula@xxxxxxxxx>; Navare, Manasi D >> >> > >> > > <manasi.d.navare@xxxxxxxxx> >> >> > >> > > Subject: Re: [PATCH] drm/i915/display: Enable >> >> > >> > > second VDSC engine for higher moderates >> >> > >> > > >> >> > >> > > On Mon, Sep 13, 2021 at 08:09:23PM +0530, Vandita Kulkarni >> wrote: >> >> > >> > > > Each VDSC operates with 1ppc throughput, hence enable the >> >> > >> > > > second VDSC engine when moderate is higher that the current >> cdclk. >> >> > >> > > > >> >> > >> > > > Signed-off-by: Vandita Kulkarni >> >> > >> > > > <vandita.kulkarni@xxxxxxxxx> >> >> > >> > > > --- >> >> > >> > > > drivers/gpu/drm/i915/display/intel_dp.c | 12 ++++++++++-- >> >> > >> > > > 1 file changed, 10 insertions(+), 2 deletions(-) >> >> > >> > > > >> >> > >> > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c >> >> > >> > > > b/drivers/gpu/drm/i915/display/intel_dp.c >> >> > >> > > > index 161c33b2c869..55878f65f724 100644 >> >> > >> > > > --- a/drivers/gpu/drm/i915/display/intel_dp.c >> >> > >> > > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c >> >> > >> > > > @@ -70,6 +70,7 @@ >> >> > >> > > > #include "intel_tc.h" >> >> > >> > > > #include "intel_vdsc.h" >> >> > >> > > > #include "intel_vrr.h" >> >> > >> > > > +#include "intel_cdclk.h" >> >> > >> > > > >> >> > >> > > > #define DP_DPRX_ESI_LEN 14 >> >> > >> > > > >> >> > >> > > > @@ -1291,10 +1292,13 @@ static int >> >> > >> > > > intel_dp_dsc_compute_config(struct >> >> > >> > > intel_dp *intel_dp, >> >> > >> > > > struct drm_connector_state >> *conn_state, >> >> > >> > > > struct link_config_limits *limits) { >> >> > >> > > > + struct intel_cdclk_state *cdclk_state; >> >> > >> > > > struct intel_digital_port *dig_port = >> dp_to_dig_port(intel_dp); >> >> > >> > > > struct drm_i915_private *dev_priv = to_i915(dig_port- >> >> > >> > > >base.base.dev); >> >> > >> > > > const struct drm_display_mode *adjusted_mode = >> >> > >> > > > &pipe_config->hw.adjusted_mode; >> >> > >> > > > + struct intel_atomic_state *state = >> >> > >> > > > + to_intel_atomic_state(pipe_config- >> >> > >> > > >uapi.state); >> >> > >> > > > int pipe_bpp; >> >> > >> > > > int ret; >> >> > >> > > > >> >> > >> > > > @@ -1373,12 +1377,16 @@ static int >> >> > >> > > > intel_dp_dsc_compute_config(struct >> >> > >> > > intel_dp *intel_dp, >> >> > >> > > > } >> >> > >> > > > } >> >> > >> > > > >> >> > >> > > > + cdclk_state = intel_atomic_get_cdclk_state(state); >> >> > >> > > > + if (IS_ERR(cdclk_state)) >> >> > >> > > > + return PTR_ERR(cdclk_state); >> >> > >> > > > + >> >> > >> > > > /* >> >> > >> > > > * VDSC engine operates at 1 Pixel per clock, so if peak pixel >> rate >> >> > >> > > > - * is greater than the maximum Cdclock and if slice count is >> even >> >> > >> > > > + * is greater than the current Cdclock and if slice >> >> > >> > > > +count is even >> >> > >> > > > * then we need to use 2 VDSC instances. >> >> > >> > > > */ >> >> > >> > > > - if (adjusted_mode->crtc_clock > dev_priv->max_cdclk_freq >> || >> >> > >> > > > + if (adjusted_mode->crtc_clock > >> >> > >> > > > +cdclk_state->actual.cdclk || >> >> > >> > > >> >> > >> > > This is wrong. We compute the cdclk based on the >> >> > >> > > requirements of the mode/etc., not the other way around. >> >> > > >> >> > > According to BSpec guideline, we decide whether we enable or >> >> > > disable second VDSC engine, based on that condition. As I >> >> > > understand that one is about DSC config calculation, based on CDCLK >> which was calculated. >> >> > >> >> > Point is, at the time compute_config gets called, what guarantees >> >> > are there that cdclk_state->actual.cdclk contains anything useful? >> >> > This is the design we have. >> >> >> >> That is actually good question, was willing to check that as well. >> >> >> >> > >> >> > > If we bump up CDCLK, to avoid this, will we even then use a second >> VDSC ever? >> >> > >> >> > I think we'll eventually need better logic than unconditionally >> >> > bumping to max, and it needs to take *both* the cdclk and the >> >> > number of dsc engines into account. The referenced bspec only has >> >> > the vdsc clock perspective, not overall perspective. >> >> >> >> What we need to clarify here is that how this is supposed to work in >> theory. >> >> Basically same issue can be fixed by both increasing the CDCLK or >> >> enabling 2nd VDSC engine. >> >> There should be some guideline telling us, how to prioritize. >> >> From overall perspective as I understand, by default, we are able to >> >> keep CDCLK 2 times less than pixel rate(see >> >> intel_pixel_rate_to_cdclk), however due to that VDSC limitation that >> >> it can use only 1 ppc this becomes, not applicable anymore(at least >> >> as of BSpec 49259), so we have to increase amount of VDSC instances >> then. >> >> >> >> So the question is now - what is more optimal here? >> >> Also if we bump up CDCLK(which we have done many times already in >> >> fact), we then need to add some logic to intel_compute_min_cdclk to >> >> check if we are using DSC or not, because otherwise we don't really need >> to do that. >> >> intel_compute_min_cdclk() already needs to be dsc aware when slice count >> is 1 and we can't use two dsc engines anyway. See the recent commit >> fe01883fdcef ("drm/i915: Get proper min cdclk if vDSC enabled"). >> >> Looking again, I'm not sure that does the right decision for when >> dsc.slice_count > 1, but dsc.split == false. It should probably use dsc.split for >> the decision. >> >> >> >> >> Stan >> > >> > Checked and indeed, encoder->compute_config is called way before, >> > basically CDCLK calculation is called almost in the end of >> > atomic_check, so in compute_config, there would be an old CDCLK value >> > copied from previous cdclk state, but not the last one. >> > >> > Vandita, this means we actually can't do it that way, if you want to >> > do anything with VDSC based on CDCLK this has to be done _after_ >> > intel_compute_min_cdclk was called. Which is not very sweet, I guess. >> > >> > So as of current architecture, it seems that the easiest way is indeed >> > to bump the CDCLK or we need to figure the way how to enable 2nd VDSC >> > somewhere else, after CDCLK was calculated. >> >> Alternatively, we could use two dsc engines more aggressively, but that >> decision currently can't take overall chosen cdclk into account. >> >> We'll end up sometimes unnecessarily using a too high cdclk or two dsc >> engines, just have to pick the poison. >> >> I think trying to do dsc decisions after intel_compute_min_cdclk() gets way >> too complicated. > > In this case, can we just use the 2nd VDSC engine if slice_count is 2 or more? > Which would mean we always operate in joiner enabled mode(small joiner) of all the compression modes of operation > mentioned in the table bspec: 49259 > Because we are still going to hit the max cdclk restriction for higher resolutions, and many lower resolutions wouldn’t need max cdclk. > And eventually once we have more details on cd clk vs 2VDSC engine we could add > the logic to choose one over the other? > > I see that in case of DSI we do split = true, for slice_count > 1 but that would need a different set of checks, thats a TBD. > > Or Do you suggest I just do this for now max cdclk when slice_count =1 (what we are doing now) replace with compression = true and split = false I think the check in intel_compute_min_cdclk() should be: if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split) That's a separate change. Enabling two dsc engines more aggressively... I don't mind doing it unconditionally when slice count > 1 for starters. But I think we'll need to improve this going forward, including fixing the mode valid checks etc. as we've discussed. Ville, any objections? BR, Jani. > > Thanks, > Vandita >> >> BR, >> Jani >> >> >> >> >> > >> > Stan >> > >> >> >> >> > >> >> > BR, >> >> > Jani. >> >> > >> >> > > Another thing is that probably enabling second VDSC is cheaper in >> >> > > terms of power consumption, than bumping up the CDCLK. >> >> > > >> >> > > Stan >> >> > > >> >> > >> > >> >> > >> > Okay , So you suggest that we set the cd clock to max when we >> have such requirement, than enabling the second engine? >> >> > >> >> >> > >> That seems like the easiest solution. Another option might be to >> >> > >> come up with some lower dotclock limit for the use of the second >> >> > >> vdsc. But not sure we know where the tipping point is wrt. powr >> consumption. >> >> > >> >> >> > >> -- >> >> > >> Ville Syrjälä >> >> > >> Intel >> >> > >> >> > -- >> >> > Jani Nikula, Intel Open Source Graphics Center >> >> -- >> Jani Nikula, Intel Open Source Graphics Center -- Jani Nikula, Intel Open Source Graphics Center