Re: [PATCH 2/2] drm/i915/dp: Use current cdclk for DSC Bigjoiner BW check

On 3/29/2023 7:14 PM, Lisovskiy, Stanislav wrote:
On Wed, Mar 29, 2023 at 02:35:38PM +0300, Ville Syrjälä wrote:
On Wed, Mar 29, 2023 at 05:00:55PM +0530, Nautiyal, Ankit K wrote:
On 3/29/2023 4:23 PM, Ville Syrjälä wrote:
On Wed, Mar 29, 2023 at 04:06:21PM +0530, Nautiyal, Ankit K wrote:
On 3/29/2023 3:27 PM, Ville Syrjälä wrote:
On Wed, Mar 29, 2023 at 02:14:49PM +0530, Ankit Nautiyal wrote:
As per Bspec, Big Joiner BW check is:
Output bpp <= PPC * CDCLK frequency * Big joiner interface bits /
Pixel clock

Currently we always use max_cdclk in the check for both modevalid
and compute config steps.

During modevalid use max_cdclk_freq for the check.
During compute config step use current cdclk for the check.
Nak. cdclk is computed much later based on what is actually needed.
The cdclk freq you are using here is essentially a random number.
Oh I didn't realise that, perhaps I was lucky when I tested this.

So this check where CDCLK is mentioned, actually expects max_cdclk_freq?

We use max_cdclk_freq basically as a "hack" to estimate the maximum CDCLK
because, for the reasons Ville mentioned, we can't use CDCLK directly here:
it hasn't been calculated yet.

However, we know that CDCLK will be aligned to the pixel rate anyway.

If it doesn't, then we might have a compressed_bpp value that violates
the big joiner BW check.

Should this be handled while computing cdclk?
Yes. I suggest adding something like intel_vdsc_min_cdclk() that
handles all of it.

I can try that out.
It is all again about that same chicken-and-egg problem.
Our paradigm is that CDCLK is the last thing that we calculate; however, that
check instructs us to choose an output bpp which obeys the

Output bpp <= PPC * CDCLK frequency * Big joiner interface bits / pixel clock

rule.

If we choose to adjust CDCLK accordingly, we lose the option to actually change
the output bpp to save power, because theoretically we could avoid increasing
CDCLK to match that rule by decreasing the output bpp.

So this kinda leads us to possibly waste more power.

I understand there is a tradeoff of power and performance, and what we are trying to do is to set the maximum compressed bpp (to have minimum compression).

But in the bigjoiner case it is possible that we set the compressed_bpp as per max_cdclk_freq, and the Bigjoiner BW check then fails because that compressed_bpp is too high for the cdclk that is actually computed.

This will not happen every time bigjoiner + DSC are involved, as the link BW check and other DSC checks may limit the compressed_bpp enough that the computed cdclk is sufficient.

However, there are cases where the bigjoiner check actually fails, resulting in pipe FIFO underruns. I have seen this in my testing with an 8k setup via PCON->HDMI2.1:

For 8k@30 the compressed_bpp is set to 21, which is fine according to max_cdclk_freq, but with the actual cdclk of 307200 kHz a compressed_bpp of 21 is too high. The result is a bigjoiner BW check failure, causing underruns and no display.

For 8k@60 we do not see this issue, as the compressed_bpp is limited by the link bandwidth check to 10 bpp, and the computed cdclk is actually max_cdclk_freq.

For 4k@120 we again see compressed_bpp set to 21, but cdclk is computed to be max_cdclk_freq, so the check doesn't fail.
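
For reference, the cases above can be checked with a small stand-alone sketch of the Bspec rule. This is not driver code: the function names are made up, the 1188000 kHz pixel clock for 8k@30 and the 36 interface bits (the value the patch uses for display versions above 12) are assumptions that are not stated in this thread, and the FEC adjustment done by intel_dp_mode_to_fec_clock() is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pixels per clock; the patch hard-codes this factor as 2. */
#define SKETCH_PPC 2

/*
 * Largest output bpp allowed by the Big Joiner BW check:
 *   bpp <= PPC * cdclk * interface_bits / pixel_clock
 * Clocks are in kHz, the result is rounded down.
 * Hypothetical helper for illustration only.
 */
static uint32_t sketch_bigjoiner_max_bpp(uint32_t cdclk_khz,
					 uint32_t interface_bits,
					 uint32_t pixel_clock_khz)
{
	assert(pixel_clock_khz > 0);
	return (uint32_t)((uint64_t)SKETCH_PPC * cdclk_khz * interface_bits /
			  pixel_clock_khz);
}

/* True if compressed_bpp satisfies the check at the given cdclk. */
static bool sketch_bigjoiner_bpp_ok(uint32_t compressed_bpp,
				    uint32_t cdclk_khz,
				    uint32_t interface_bits,
				    uint32_t pixel_clock_khz)
{
	return compressed_bpp <= sketch_bigjoiner_max_bpp(cdclk_khz,
							  interface_bits,
							  pixel_clock_khz);
}
```

With these assumed inputs the limit at 307200 kHz comes out at 18 bpp, so 21 bpp fails the check, while at 556800 kHz or 652800 kHz it passes.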

So IMHO, we do need a check to avoid the issue mentioned above.

I think we should let the existing check in dp_get_output_bpp remain as it is, with max_cdclk_freq. This will help to get an acceptable compressed_bpp, and if other DSC constraints come into play we will get a lower value for the bpp.

Let's add a check in intel_crtc_compute_min_cdclk, via intel_vdsc_min_cdclk() (as suggested by Ville), where, if bigjoiner is used, we update min_cdclk to accommodate the compressed_bpp. In the worst case we have to use max_cdclk_freq.

I have tested something like this with the above-mentioned setup. The compressed_bpp is set to 21 as before, but the computed cdclk is 556800 kHz, which is enough to honor the check and is still less than the max cdclk of 652800 kHz.
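
A minimal sketch of the idea, assuming the rule is simply solved for CDCLK and clamped to the maximum. The helper name, its signature, and the example pixel clock of 1188000 kHz are hypothetical; this is not the intel_vdsc_min_cdclk() that was eventually proposed, and a real driver would still have to pick a supported CDCLK frequency at or above the value returned here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: smallest cdclk (kHz) that satisfies
 *   compressed_bpp <= PPC * cdclk * interface_bits / pixel_clock
 * i.e. cdclk >= compressed_bpp * pixel_clock / (PPC * interface_bits),
 * rounded up and clamped to max_cdclk_khz.
 */
static uint32_t sketch_bigjoiner_min_cdclk(uint32_t compressed_bpp,
					   uint32_t pixel_clock_khz,
					   uint32_t interface_bits,
					   uint32_t max_cdclk_khz)
{
	const uint32_t ppc = 2; /* same factor the patch hard-codes */
	uint64_t num = (uint64_t)compressed_bpp * pixel_clock_khz;
	uint64_t den = (uint64_t)ppc * interface_bits;
	uint64_t min_cdclk;

	assert(den > 0);
	min_cdclk = (num + den - 1) / den; /* round up */

	/* Worst case: fall back to the maximum cdclk. */
	return min_cdclk > max_cdclk_khz ? max_cdclk_khz : (uint32_t)min_cdclk;
}
```

Under these assumptions, 21 bpp with 36 interface bits needs at least 346500 kHz, which is above the 307200 kHz that caused the underruns and below the 556800 kHz reported above.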

If this approach makes sense, I can float the changes.

Regards,

Ankit



Stan

Will also add *Pipe BW check*: Pixel clock < PPC * CDCLK frequency *
Number of pipes joined, which seems to be missing.
That is already accounted for in the pixel rate.

So with the pipe BW check, cdclk should be > Pixel clock / (PPC * Number of
pipes joined).

In addition, as per the bigjoiner check it should be >= compressed_bpp *
Pixel clock / (PPC * bigjoiner interface bits).
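
The two lower bounds can be combined by taking the larger one. The sketch below assumes both bounds follow directly from the Bspec rules quoted in this thread; the function name and parameters are hypothetical, and Ville's point that the pipe BW check is already covered by the pixel rate means the first bound may be redundant in the real driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: minimum cdclk (kHz) for joined pipes.
 *   Pipe BW check:   pixel_clock < PPC * cdclk * num_joined_pipes
 *   Bigjoiner check: bpp <= PPC * cdclk * interface_bits / pixel_clock
 */
static uint32_t sketch_joined_min_cdclk(uint32_t pixel_clock_khz,
					uint32_t num_joined_pipes,
					uint32_t compressed_bpp,
					uint32_t interface_bits)
{
	const uint32_t ppc = 2; /* same factor the patch hard-codes */
	uint64_t pipe_den = (uint64_t)ppc * num_joined_pipes;
	uint64_t joiner_den = (uint64_t)ppc * interface_bits;
	uint64_t pipe_min, joiner_min;

	assert(pipe_den > 0 && joiner_den > 0);

	/* Strict inequality: smallest integer strictly above the quotient. */
	pipe_min = pixel_clock_khz / pipe_den + 1;

	/* Non-strict inequality: round the quotient up. */
	joiner_min = ((uint64_t)compressed_bpp * pixel_clock_khz +
		      joiner_den - 1) / joiner_den;

	return (uint32_t)(pipe_min > joiner_min ? pipe_min : joiner_min);
}
```

Which bound dominates depends on the compressed bpp: at a low bpp the pipe BW bound is larger, at a high bpp the bigjoiner bound takes over.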

Regards,

Ankit

Regards,

Ankit

Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@xxxxxxxxx>
---
    drivers/gpu/drm/i915/display/intel_dp.c     | 9 ++++++---
    drivers/gpu/drm/i915/display/intel_dp.h     | 1 +
    drivers/gpu/drm/i915/display/intel_dp_mst.c | 1 +
    3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 3fe651a8f5d0..d6600de1ab49 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -711,6 +711,7 @@ u32 intel_dp_dsc_nearest_valid_bpp(struct drm_i915_private *i915, u32 bpp, u32 p
    u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
    				u32 link_clock, u32 lane_count,
    				u32 mode_clock, u32 mode_hdisplay,
+				unsigned int cdclk,
    				bool bigjoiner,
    				u32 pipe_bpp,
    				u32 timeslots)
@@ -757,9 +758,9 @@ u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
    	if (bigjoiner) {
    		int bigjoiner_interface_bits = DISPLAY_VER(i915) <= 12 ? 24 : 36;
-		u32 max_bpp_bigjoiner =
-			i915->display.cdclk.max_cdclk_freq * 2 * bigjoiner_interface_bits /
-			intel_dp_mode_to_fec_clock(mode_clock);
+
+		u32 max_bpp_bigjoiner = cdclk * 2 * bigjoiner_interface_bits /
+					intel_dp_mode_to_fec_clock(mode_clock);
    		bits_per_pixel = min(bits_per_pixel, max_bpp_bigjoiner);
    	}
@@ -1073,6 +1074,7 @@ intel_dp_mode_valid(struct drm_connector *_connector,
    							    max_lanes,
    							    target_clock,
    							    mode->hdisplay,
+							    dev_priv->display.cdclk.max_cdclk_freq,
    							    bigjoiner,
    							    pipe_bpp, 64) >> 4;
    			dsc_slice_count =
@@ -1580,6 +1582,7 @@ int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
    							    pipe_config->lane_count,
    							    adjusted_mode->crtc_clock,
    							    adjusted_mode->crtc_hdisplay,
+							    dev_priv->display.cdclk.hw.cdclk,
    							    pipe_config->bigjoiner_pipes,
    							    pipe_bpp,
    							    timeslots);
diff --git a/drivers/gpu/drm/i915/display/intel_dp.h b/drivers/gpu/drm/i915/display/intel_dp.h
index ef39e4f7a329..d150bfe8abf4 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.h
+++ b/drivers/gpu/drm/i915/display/intel_dp.h
@@ -106,6 +106,7 @@ int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 dsc_max_bpc);
    u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
    				u32 link_clock, u32 lane_count,
    				u32 mode_clock, u32 mode_hdisplay,
+				unsigned int cdclk,
    				bool bigjoiner,
    				u32 pipe_bpp,
    				u32 timeslots);
diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c b/drivers/gpu/drm/i915/display/intel_dp_mst.c
index a860cbc5dbea..266e31b78729 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
@@ -925,6 +925,7 @@ intel_dp_mst_mode_valid_ctx(struct drm_connector *connector,
    							    max_lanes,
    							    target_clock,
    							    mode->hdisplay,
+							    dev_priv->display.cdclk.max_cdclk_freq,
    							    bigjoiner,
    							    pipe_bpp, 64) >> 4;
    			dsc_slice_count =
--
2.25.1
--
Ville Syrjälä
Intel


