Hi Matt, first of all thanks a lot for the observations you are raising. On Wed, Feb 21, 2024 at 12:51:04PM -0800, Matt Roper wrote: > On Wed, Feb 21, 2024 at 01:12:18AM +0100, Andi Shyti wrote: > > On Tue, Feb 20, 2024 at 03:39:18PM -0800, Matt Roper wrote: > > > On Tue, Feb 20, 2024 at 03:35:26PM +0100, Andi Shyti wrote: ... > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c > > > > index a425db5ed3a2..e19df4ef47f6 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > > > > @@ -168,6 +168,14 @@ static void init_unused_rings(struct intel_gt *gt) > > > > } > > > > } > > > > > > > > +static void intel_gt_apply_ccs_mode(struct intel_gt *gt) > > > > +{ > > > > + if (!IS_DG2(gt->i915)) > > > > + return; > > > > + > > > > + intel_uncore_write(gt->uncore, XEHP_CCS_MODE, 0); > > > > > > This doesn't look right to me. A value of 0 means every cslice gets > > > associated with CCS0. > > > > Yes, that's what I'm trying to do. The behavior I'm looking for > > is this one: > > > > /* > > ... > > * With 1 engine (ccs0): > > * slice 0, 1, 2, 3: ccs0 > > * > > * With 2 engines (ccs0, ccs1): > > * slice 0, 2: ccs0 > > * slice 1, 3: ccs1 > > * > > * With 4 engines (ccs0, ccs1, ccs2, ccs3): > > * slice 0: ccs0 > > * slice 1: ccs1 > > * slice 2: ccs2 > > * slice 3: ccs3 > > ... > > */ > > > > where the user can configure runtime the mode, making sure that > > no client is connected to i915. > > > > But, this needs to be written > > > > As we are now forcing mode '1', then all cslices are connected > > with ccs0. > > Right --- and that's what I'm pointing out as illegal. I think that > code comment above was taken out of context from a different RFC series; > that's not an accurate description of the behavior we want here. > > First, the above comment is using ccs# to refer to userspace engines, > not hardware engines. As a simple example, DG2-G11 only ever has a > single CCS which userspace sees as "instance 0" but which is actually > CCS1 at the hardware level. If you try to follow the comment above when > programming CCS_MODE, you've assigned all of the cslices to a > non-existent engine and assigned no cslices to the CCS engine that > actually exists. For DG2-G10 (and I think DG2-G12), there are different > combinations of fused-off / not-fused-off engines that will always show > up in userspace as CCS0-CCSn, even if those don't match the hardware > IDs. > > Second, the above comment is assuming that you have a part with a > maximum fusing config (i.e., all cslices present). Using DG2-G11 again > as an example, there's also only a single cslice (cslice1), so if you > tell CCS1 that it's allowed to use EUs from non-existent cslice0, > cslice2, and cslice3, you might not get the behavior you were hoping > for. if the hardware slices are fused off we wouldn't see them in a first place, right? And that's anyway a permanent configuration that wouldn't affect the patch. BTW, is there any machine I can test this scenario? > > > On a DG2-G11 platform, that will flat out break > > > compute since CCS0 is never present (G11 only has a single CCS and it's > > > always the hardware's CCS1). Even on a G10 or G12 this could also break > > > things depending on the fusing of your card if the hardware CCS0 happens > > > to be missing. > > > > > > Also, the register says that we need a field value of 0x7 for each > > > cslice that's fused off. By passing 0, we're telling the CCS engine > > > that it can use cslices that may not actually exist. > > > > does it? Or do I need to write the id (0x0-0x3) of the user > > engine? That's how the mode is calculated in this algorithm. > > 0x0 - 0x3 are how you specify that a specific CCS engine can use the > cslice. If the cslice can't be used at all (i.e., it's fused off), then > you need to program a 0x7 to ensure no engines try to use the > non-existent DSS/EUs. I planned to limit this to the only DG2 (and ATSM, of course). Do you think it would it help? Andi