Patch "drm/i915/gt: Fix CCS id's calculation for CCS mode setting" has been added to the 6.9-stable tree

This is a note to let you know that I've just added the patch titled

    drm/i915/gt: Fix CCS id's calculation for CCS mode setting

to the 6.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-i915-gt-fix-ccs-id-s-calculation-for-ccs-mode-se.patch
and it can be found in the queue-6.9 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit f310e66aa3a8500712de6bc6f765d2be702fb388
Author: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx>
Date:   Fri May 17 11:06:16 2024 +0200

    drm/i915/gt: Fix CCS id's calculation for CCS mode setting
    
    [ Upstream commit ee01b6a386eaf9984b58a2476e8f531149679da9 ]
    
    The whole point of the previous fixes has been to change the CCS
    hardware configuration to generate only one stream available to
    the compute users. We did this by changing the info.engine_mask
    that is set during device probe, reset during the detection of
    the fused engines, and finally reset again when choosing the CCS
    mode.
    
    We can't use the engine_mask variable anymore, as with the
    current configuration, it imposes only one CCS no matter what the
    hardware configuration is.
    
    Before changing the engine_mask for the third time, save it and
    use it for calculating the CCS mode.
    
    After the previous changes, a user reported that performance
    dropped to around 1/4. We have verified that, with this patch,
    compute performance improves by the same factor.
    
    Fixes: 6db31251bb26 ("drm/i915/gt: Enable only one CCS for compute workload")
    Signed-off-by: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx>
    Cc: Chris Wilson <chris.p.wilson@xxxxxxxxxxxxxxx>
    Cc: Gnattu OC <gnattuoc@xxxxxx>
    Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
    Cc: Matt Roper <matthew.d.roper@xxxxxxxxx>
    Tested-by: Jian Ye <jian.ye@xxxxxxxxx>
    Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx>
    Tested-by: Gnattu OC <gnattuoc@xxxxxx>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@xxxxxxxxxxxxxxx
    (cherry picked from commit a09d2327a9ba8e3f5be238bc1b7ca2809255b464)
    Signed-off-by: Jani Nikula <jani.nikula@xxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
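
The ordering problem described in the commit message above can be illustrated with a small standalone sketch. It is not driver code: `struct fake_gt`, `fake_init_engine_mask()`, `count_slices()` and `MAX_CCS` are hypothetical stand-ins, assuming a part with four unfused CCS engines. It only shows why the slice mask has to be saved before the engine mask is trimmed to a single CCS.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CCS 4u              /* hypothetical stand-in for I915_MAX_CCS */
#define BIT(n)  (1u << (n))

/* Hypothetical, simplified stand-in for the fields the patch touches. */
struct fake_gt {
	uint32_t ccs_engine_mask;   /* CCS bits of the engine mask */
	uint32_t cslices;           /* saved copy, like gt->ccs.cslices */
};

/* Same order of operations as the patch: save the mask, then trim it. */
static void fake_init_engine_mask(struct fake_gt *gt)
{
	uint32_t mask = gt->ccs_engine_mask;

	if (!mask)
		return;

	/* Save the full slice mask before it is overwritten. */
	gt->cslices = mask;

	/* Keep only the lowest set bit, i.e. the first CCS engine. */
	gt->ccs_engine_mask = mask & (~mask + 1u);
}

/* Count the slices a load-balancing loop would see in a given mask. */
static unsigned int count_slices(uint32_t mask)
{
	unsigned int cslice, n = 0;

	for (cslice = 0; cslice < MAX_CCS; cslice++)
		if (mask & BIT(cslice))
			n++;

	return n;
}

/* Worked example: four slices present, one engine exposed. */
static void check_example(void)
{
	struct fake_gt gt = { .ccs_engine_mask = 0xF, .cslices = 0 };

	fake_init_engine_mask(&gt);

	/* The trimmed mask reports one slice; the saved copy reports four. */
	assert(count_slices(gt.ccs_engine_mask) == 1);
	assert(count_slices(gt.cslices) == 4);
}
```

In this sketch, a loop that iterates over the trimmed mask sees one slice out of four, which is consistent with the roughly 1/4 performance figure quoted above; iterating over the saved copy sees all four.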

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7a6dc371c384e..bc6209df0f680 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -919,6 +919,12 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
 	if (IS_DG2(gt->i915)) {
 		u8 first_ccs = __ffs(CCS_MASK(gt));
 
+		/*
+		 * Store the number of active cslices before
+		 * changing the CCS engine configuration
+		 */
+		gt->ccs.cslices = CCS_MASK(gt);
+
 		/* Mask off all the CCS engine */
 		info->engine_mask &= ~GENMASK(CCS3, CCS0);
 		/* Put back in the first CCS engine */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
index 99b71bb7da0a6..3c62a44e9106c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
@@ -19,7 +19,7 @@ unsigned int intel_gt_apply_ccs_mode(struct intel_gt *gt)
 
 	/* Build the value for the fixed CCS load balancing */
 	for (cslice = 0; cslice < I915_MAX_CCS; cslice++) {
-		if (CCS_MASK(gt) & BIT(cslice))
+		if (gt->ccs.cslices & BIT(cslice))
 			/*
 			 * If available, assign the cslice
 			 * to the first available engine...
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index def7dd0eb6f19..cfdd2ad5e9549 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -207,6 +207,14 @@ struct intel_gt {
 					    [MAX_ENGINE_INSTANCE + 1];
 	enum intel_submission_method submission_method;
 
+	struct {
+		/*
+		 * Mask of the non fused CCS slices
+		 * to be used for the load balancing
+		 */
+		intel_engine_mask_t cslices;
+	} ccs;
+
 	/*
 	 * Default address space (either GGTT or ppGTT depending on arch).
 	 *



