On 22/08/2018 18:07, Tvrtko Ursulin wrote:
On 22/08/2018 17:33, Lionel Landwerlin wrote:
On 22/08/2018 17:18, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Bitfield width for configuring the active slice count has grown in Gen11,
so we need to program GEN8_R_PWR_CLK_STATE accordingly.
Current code was always requesting eight times the number of slices (due to
writing to a bitfield starting three bits higher than it should). Luckily,
a) these requests were capped by the hardware to the available number of
slices, and b) we haven't yet exported the code to ask for reduced slice
configurations.
Due to both of the above there was no impact from this incorrect
programming, but we should still fix it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Bspec: 12247
Reported-by: tony.ye@xxxxxxxxx
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx>
Cc: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx>
Cc: tony.ye@xxxxxxxxx
---
drivers/gpu/drm/i915/i915_reg.h | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 12 ++++++++----
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 59d06d0055bb..640f7b774a26 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -344,6 +344,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define GEN8_RPCS_S_CNT_ENABLE (1 << 18)
#define GEN8_RPCS_S_CNT_SHIFT 15
#define GEN8_RPCS_S_CNT_MASK (0x7 << GEN8_RPCS_S_CNT_SHIFT)
+#define GEN11_RPCS_S_CNT_SHIFT 12
+#define GEN11_RPCS_S_CNT_MASK (0x3f << GEN11_RPCS_S_CNT_SHIFT)
#define GEN8_RPCS_SS_CNT_ENABLE (1 << 11)
#define GEN8_RPCS_SS_CNT_SHIFT 8
#define GEN8_RPCS_SS_CNT_MASK (0x7 << GEN8_RPCS_SS_CNT_SHIFT)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 36050f085071..43b8b0675ba0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2501,10 +2501,14 @@ make_rpcs(struct drm_i915_private *dev_priv)
* enablement.
*/
if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
- rpcs |= GEN8_RPCS_S_CNT_ENABLE;
- rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask) <<
- GEN8_RPCS_S_CNT_SHIFT;
- rpcs |= GEN8_RPCS_ENABLE;
+ rpcs = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
+
+ if (INTEL_GEN(dev_priv) >= 11)
+ rpcs <<= GEN11_RPCS_S_CNT_SHIFT;
+ else
+ rpcs <<= GEN8_RPCS_S_CNT_SHIFT;
+
+ rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE;
I don't know if you saw this wording in the documentation:
"
Note: In ICL, software programs this register as if GT consists
of 2 slices with 4 subslices in each slice. Hardware maps this
to the 1 slice/8-subslice physical layout.
"
My understanding is that it would make this function a bit more
complicated ;)
It also implies that on ICL you cannot select 3 subslices, which
is unfortunately what Tony was trying to do.
Maybe some opens need to be raised as to what's possible on ICL.
I interpreted the note in my head as "In ICL, _if_ _the_ software
programs.." so did not see a problem. Thought that would be just
some hidden/under the covers remapping hw would do. But I see now
that was wrong, and you are most likely right. I'll try to do some
digging to understand this better.
But for the second part of it, I don't see why 1x3 configuration
would be illegal. If software must assume hw is 2x4, even if in
reality it is 1x8, then 1x3 would still be legal, no?
We still seem to put a subslice number in the register for ICL
(values being 0b001, 0b010, 0b011 & 0b100).
My understanding is that the hardware will just multiply that
value by 2 to map to the 1x8 underlying topology.
So if that's the case, you can't really do odd numbers... ¯\_(ツ)_/¯
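To make the point about odd counts concrete, here is a minimal sketch in C. It assumes the behaviour guessed at in this thread (the register accepts subslice values 1 to 4 and the hardware doubles them onto the 1x8 layout), which has not been confirmed against the Bspec. Both helper names are hypothetical and are not i915 functions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * ASSUMPTION (from the discussion, unconfirmed): ICL accepts programmed
 * subslice counts 1..4 and hardware multiplies the value by 2 to map it
 * onto the physical 1 slice / 8 subslice layout.
 */
static unsigned int icl_physical_subslices(unsigned int programmed)
{
	assert(programmed >= 1 && programmed <= 4);
	return programmed * 2;
}

/* Hypothetical check: can any legal register value yield 'wanted' subslices? */
static bool icl_subslice_count_reachable(unsigned int wanted)
{
	for (unsigned int v = 1; v <= 4; v++) {
		if (icl_physical_subslices(v) == wanted)
			return true;
	}
	return false;
}
```

Under that assumption only 2, 4, 6 and 8 physical subslices are reachable, so a request for 3 has no exact encoding.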
I thought the cause of the hang on ICL was that when Tony was
asking for 1x3, the code was actually programming a request for
8x3 - which is illegal (as in slice count must be 1 to enable
subslice pg) and so would fail to actually turn off the unwanted
subslices.
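The "8x" arithmetic can be checked with a small standalone C sketch. The shift and mask values are copied from the i915_reg.h hunk in the patch; the two helper functions are hypothetical and only model how a value written at the Gen8 position would read back through the Gen11 field.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions as defined in the patch. */
#define GEN8_RPCS_S_CNT_SHIFT	15
#define GEN11_RPCS_S_CNT_SHIFT	12
#define GEN11_RPCS_S_CNT_MASK	(0x3fu << GEN11_RPCS_S_CNT_SHIFT)

/* Hypothetical helper: place a slice count at the given bit position. */
static uint32_t rpcs_encode_slices(unsigned int slices, unsigned int shift)
{
	assert(slices <= 0x3f);
	return (uint32_t)slices << shift;
}

/* Hypothetical helper: extract the count from the Gen11 field position. */
static unsigned int gen11_decode_slices(uint32_t rpcs)
{
	return (rpcs & GEN11_RPCS_S_CNT_MASK) >> GEN11_RPCS_S_CNT_SHIFT;
}
```

With these definitions, a count of 1 written with GEN8_RPCS_S_CNT_SHIFT reads back as 8 through the Gen11 field, because the value lands three bits too high (2^3 = 8). Written with GEN11_RPCS_S_CNT_SHIFT it reads back as 1.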
But then I also thought that on ICL we deal with masks and not counts
when programming the hardware. Since apparently it is counts both
for slices and subslices, I am mightily confused as to how
media-driver would even theoretically be able to turn off a
_specific_ (sub)slice?!
The fact that the feature needed isn't implemented by the
thread dispatcher is really strange to me too.
It sounds like we're forced to use a bigger hammer than what we
really need.
As to how that maps to the right subslices is also unknown to me.
The only explanation I have is that subslices with no VME
samplers get turned off first in the list of subslices to turn off.
Cheers,
- Lionel
Regards,
Tvrtko
}
if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx