Hi,
On Friday 04 November 2016 12:36 AM, Paulo Zanoni wrote:
Em Qui, 2016-10-13 às 16:28 +0530, Kumar, Mahesh escreveu:
This patch adds support to decode system memory bandwidth
which will be used for arbitrated display memory percentage
calculation in GEN9 based system.
Changes from v1:
- Address comments from Paulo
- implement decode function for SKL/KBL also
Again, my understanding of these registers is not good so I will ask
some questions that will help me properly understand and review the
code. I may also end up asking some questions that don't make sense.
Please correct me in case any of my assumptions is wrong.
Perhaps answers to my questions could become code comments since I
suppose most of the gfx team is not super familiar with the memory
regs, and we're going to have to maintain the code.
Signed-off-by: "Kumar, Mahesh" <mahesh1.kumar@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_drv.c | 170
++++++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_drv.h | 14 ++++
drivers/gpu/drm/i915/i915_reg.h | 38 +++++++++
3 files changed, 222 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 89d3222..b5f601c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -979,6 +979,170 @@ static void intel_sanitize_options(struct
drm_i915_private *dev_priv)
DRM_DEBUG_DRIVER("use GPU sempahores? %s\n",
yesno(i915.semaphores));
}
+static inline void skl_memdev_info_fill(struct memdev_info *info,
uint32_t val)
+{
+ uint8_t channel_width, rank;
+
+ info->rank_valid = true;
+ channel_width = (val >> SKL_DRAM_CHANNEL_WIDTH_SHIFT) &
+ SKL_DRAM_CHANNEL_WIDTH_MASK;
Why are we looking at the L bits instead of the S bits? What's the
difference between them?
We need to see both L & S bits. This happen due to incomplete
interpretation/explanation of register.
There can be two dimm per channel, & each nibble tells the information
about L/S dimm
+ if (channel_width == SKL_DRAM_WIDTH_1)
+ info->channel_width_bytes = 1;
+ else if (channel_width == SKL_DRAM_WIDTH_2)
+ info->channel_width_bytes = 2;
+ else
+ info->channel_width_bytes = 4;
This is not checking for the invalid value of 0x3. Perhaps a switch()
instead of an if() ladder would be better here.
+
+ rank = (val >> SKL_DRAM_RANK_SHIFT) & SKL_DRAM_RANK_MASK;
+ if (rank == SKL_DRAM_RANK_SINGLE)
+ info->rank = DRAM_RANK_SINGLE;
+ else
+ info->rank = DRAM_RANK_DUAL;
+}
+
+static int
+skl_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+ uint32_t val = 0;
+ uint32_t mem_speed = 0;
+ struct memdev_info *memdev_info = &dev_priv->memdev_info;
+
+ val = I915_READ(SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+ mem_speed = div_u64((uint64_t) (val & SKL_REQ_DATA_MASK) *
+ SKL_MEMORY_FREQ_MULTIPLIER, 1000);
But but but isn't it possible to have more than 2GHz in SKL? I think I
have a big SKL machine with more than 2GHz here... I'll check it later.
Also, don't we need to check FREQ_TYPE and ODD_RATIO?
It seems it's possible,
here we have to multiply frequency by 2 again to get the actual freq. So
we can have upto 4GHz. current MAX is 2133MHz
This was not mentioned in the CSpec, but got confirmation from HW team.
Some digging suggests that maybe we need to figure out somehow if we're
using DCLK or QCLK and multiply accordingly (based on the BIOS_REQ
register instead of BIOS_DATA).
I'd really need some clarification on how all of this works.
Also, it looks like there's no need for this to be a 64bit calculation
due to the mask limit.
+
+ if (mem_speed == 0)
+ return -EINVAL;
+
+ memdev_info->valid = true;
+ memdev_info->mem_speed_khz = mem_speed;
+ memdev_info->num_channels = 0;
+
+ memdev_info->rank_valid = false;
+ val = I915_READ(SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN);
+
+ if (val != 0xFFFFFFFF) {
+ memdev_info->num_channels++;
+ skl_memdev_info_fill(memdev_info, val);
+ }
+
+ val = I915_READ(SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN);
+
+ if (val != 0xFFFFFFFF) {
+ memdev_info->num_channels++;
+ if (!memdev_info->rank_valid)
+ skl_memdev_info_fill(memdev_info, val);
+ }
A simple iteration over the two register addresses would have saved
some code :).
Also, isn't it possible to have channels with different width/rank?
Yes, channel can have different RANK, assumption made here is wrong.
need to rewrite the code according to new information got.
Shouldn't our code be sanity-checking this? Perhaps grab the width/rank
for each channel and compare. When the specs are unclear or don't
exist, it's probably better if our code is able to check/validate its
own assumptions.
+
+ if (memdev_info->num_channels == 0)
+ memdev_info->valid = false;
+ return 0;
+}
+
+static int
+bxt_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+ struct memdev_info *memdev_info = &dev_priv->memdev_info;
+ uint32_t dram_channel;
+ uint32_t mem_speed, val;
+ uint8_t num_channel, dram_type;
+ int i;
+
+ val = I915_READ(BXT_P_CR_MC_BIOS_REQ_0_0_0);
+ mem_speed = div_u64((uint64_t) (val & BXT_REQ_DATA_MASK) *
+ SKL_MEMORY_FREQ_MULTIPLIER, 1000);
AFAIU, there's no need for 64 bit division since the maximum value
is 8399790.
Will remove 64 bit operation
+
+ if (mem_speed == 0)
+ return -EINVAL;
+
+ memdev_info->valid = true;
+ memdev_info->mem_speed_khz = mem_speed;
+ dram_type = (val >> BXT_DRAM_TYPE_SHIFT) &
BXT_DRAM_TYPE_MASK;
+ dram_channel = (val >> BXT_DRAM_CHANNEL_SHIFT) &
BXT_DRAM_CHANNEL_MASK;
+ num_channel = hweight32(dram_channel);
+
+ /*
+ * The lpddr3 and lpddr4 technologies can have 1-4 channels
and the
+ * channels are 32bits wide; while ddr3l technologies can
have 1-2
+ * channels and the channels are 64 bits wide. In case of
single 64 bit
+ * wide DDR3L dimm sets two channel-active bits in
+ * P_CR_MC_BIOS_REQ_0_0_0 register and system with two DDR3L
64bit dimm
+ * will set all four channel-active bits in above register.
+ * In order to get actual number of channels in DDR3L DRAM
we need to
+ * device total channel-active bits set by 2.
+ */
+
+ switch (dram_type) {
+ case BXT_DRAM_TYPE_LPDDR3:
+ case BXT_DRAM_TYPE_LPDDR4:
+ memdev_info->channel_width_bytes = 4;
+ memdev_info->num_channels = num_channel;
+ break;
+ case BXT_DRAM_TYPE_DDR3L:
+ memdev_info->channel_width_bytes = 8;
+ memdev_info->num_channels = num_channel / 2;
+ break;
+ default:
+ DRM_DEBUG_KMS("Unknown DRAM type\n");
+ memdev_info->valid = false;
+ return -EINVAL;
+ }
Shouldn't channel_width_bytes just be num_channels * 4 for LPDDR3/4 and
num_channels * 8 for DDR3L?
Or just multiply by 4 before we do the "num_channel / 2"... This would
avoid even having to find out the DRAM type (in case it's not needed
anywhere else).
DRAM type is not needed anywhere else, but it's require to correctly
find the number of channels,
which will be needed while deciding on WA.
+
+ /*
+ * Now read each DUNIT8/9/10/11 to check the rank of each
dimms.
+ * all the dimms should have same rank as in first valid
Dimm
+ */
I think we need to explain in the comment why we're iterating over the
4 dunits but just looking at the first one that's not 0xFFFFFFFF.
As told above, assumption made that all channel will have same RANK is
wrong, need to rework.
Isn't there a way to discover which dunits to look at based on the
number of channels and DRAM type?
In case of DDR3L dunit-9/10 only can be set as per number of channels.
Don't you think it'll add more complexity in the code?
Also, does the valid dunit amount reflect the number of channels like
in SKL?
Yes, number of channel will be equal to DUNIT set.
Oh, using this we can re-factor the code & calculate num of channels
same way as in SKL & always assume
channel_width_bytes * channel as "4 * active_bits set above",
even instead of storing freq & channel width separately, we can store
system bandwidth itself. This will avoid same "system bandwidth"
calculation in each flip.
+ memdev_info->rank_valid = false;
+ for (i = 0; i < BXT_D_CR_DRP0_DUNIT_MAX; i++) {
+ val = I915_READ(BXT_D_CR_DRP0_DUNIT(i));
+ if (val != 0xFFFFFFFF) {
+ uint8_t rank;
+
+ memdev_info->rank_valid = true;
+ rank = val & BXT_DRAM_RANK_MASK;
+ if (rank == BXT_DRAM_RANK_SINGLE)
+ memdev_info->rank =
DRAM_RANK_SINGLE;
+ else if (rank == BXT_DRAM_RANK_DUAL)
+ memdev_info->rank = DRAM_RANK_DUAL;
+ else
+ memdev_info->rank = DRAM_RANK_NONE;
Probably DRM_DEBUG_KMS("something\n") here? Also maybe set rank_valid
back to false? Return non-zero?
+
+ break;
+ }
+ }
+ return 0;
+}
+
+static void
+intel_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+ struct memdev_info *memdev_info = &dev_priv->memdev_info;
+ int ret;
+
+ memdev_info->valid = false;
+ if (!IS_GEN9(dev_priv))
+ return;
+
+ if (IS_BROXTON(dev_priv))
+ ret = bxt_get_memdev_info(dev_priv);
+ else
+ ret = skl_get_memdev_info(dev_priv);
+ if (ret)
+ return;
What I find a little hard to read is that we set memdev_info->valid
both here and in the functions we call. Due to the early returns it's
not super simple to see the possible value it may get, and things get
even more complicated due to the fact that we may return 0 from
skl_get_memdev_info() while still setting memdev_info->valid to false.
Things would be much simpler and easier to read if we only set
memdev_info->valid in this function, based on the return value of the
get_memdev_info() functions. And then the get_memdev_info() functions
would appropriately return zero/nonzero based on valid/invalid
configuration.
+
+ if (memdev_info->valid) {
+ DRM_DEBUG_DRIVER("DRAM speed-%d Khz total-channels-
%d channel-width-%d bytes\n",
+ memdev_info->mem_speed_khz,
+ memdev_info->num_channels,
+ memdev_info->channel_width_bytes);
As I mentioned in my previous review, please use the appropriate
specifiers here instead of %d for everybody.
Also, we usually print "variable: value, other variable: value" instead
of "variable-value other-variable-value". Please use the established
convention. Reading "total-channels-1" is less clear than "total
channels: 1" and doesn't raise ambiguity concerns with negative values.
+ if (memdev_info->rank_valid)
+ DRM_DEBUG_DRIVER("DRAM rank-%s\n",
+ (memdev_info->rank == DRAM_RANK_DUAL) ?
+ "dual" : "single");
But there's also DRAM_RANK_NONE.
+ }
+}
+
+
/**
* i915_driver_init_hw - setup state requiring device access
* @dev_priv: device private
@@ -1082,6 +1246,12 @@ static int i915_driver_init_hw(struct
drm_i915_private *dev_priv)
DRM_DEBUG_DRIVER("can't enable MSI");
}
+ /*
+ * Fill the memdev structure to get the system raw bandwidth
+ * This will be used by WM algorithm, to implement GEN9
based WA
+ */
+ intel_get_memdev_info(dev_priv);
+
return 0;
out_ggtt:
diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index a219a35..adbd9aa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2057,6 +2057,20 @@ struct drm_i915_private {
bool distrust_bios_wm;
} wm;
+ struct memdev_info {
+ bool valid;
+ uint32_t mem_speed_khz;
+ uint8_t channel_width_bytes;
+ uint8_t num_channels;
+ bool rank_valid;
+ enum {
+ DRAM_RANK_NONE = 0,
+ DRAM_RANK_SINGLE,
+ DRAM_RANK_DUAL
+ } rank;
What's the meaning of DRAM_RANK_NONE? Does DRAM_RANK_NONE imply
rank_valid=false? What's the meaning of rank=DRAM_RANK_NONE when
rank_valid=true?
Perhaps we could get rid of the rank_valid variable? Maybe rename
DRAM_RANK_NONE to DRAM_RANK_INVALID?
Agree, will make change to use DRAM_RANK_INVALID instead of rank_valid
variable :)
+ } memdev_info;
+
+
struct i915_runtime_pm pm;
/* Abstract the submission mechanism (legacy ringbuffer or
execlists) away */
diff --git a/drivers/gpu/drm/i915/i915_reg.h
b/drivers/gpu/drm/i915/i915_reg.h
index acc767a..a9c467c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7721,6 +7721,44 @@ enum {
#define DC_STATE_DEBUG_MASK_CORES (1<<0)
#define DC_STATE_DEBUG_MASK_MEMORY_UP (1<<1)
+#define _MMIO_MCHBAR_PIPE(x, a, b) _MMIO(MCHBAR_MIRROR_BASE_S
NB + _PIPE(x, a, b))
Wait, what? I may have misunderstood something, but AFAICS the
registers that use this macro are not per-pipe, so using the _PIPE
macro makes things super confusing. In this case it should be fine to
just do something a little more hardcoded, like:
#define BXT_D_CR_DRP0_DUNIT(unit) _MMIO(MCHBAR_MIRROR_BASE_SNB + (unit)
* 0x200)
I was trying to use already available MACRO, will make changes as
suggested by you.
+#define BXT_P_CR_MC_BIOS_REQ_0_0_0 _MMIO(MCHBAR_MIRROR_BASE_S
NB + 0x7114)
+#define BXT_REQ_DATA_MASK (0x3F << 0)
Or just 0x3F to be consistent with the other definitions.
+#define BXT_DRAM_TYPE_SHIFT 24
+#define BXT_DRAM_TYPE_MASK 0x7
The spec you sent me lists these bits as reserved. Are you sure these
come from this register?
bits 26:24 tells the DRAM_TYPE information in
P_CR_MC_BIOS_REQ_0_0_0_MCHBAR register
thanks for review,
Regards,
-Mahesh
+#define BXT_DRAM_CHANNEL_SHIFT 12
+#define BXT_DRAM_CHANNEL_MASK 0xF
I'd rename these to BXT_DRAM_CHANNEL_ACTIVE_{SHIFT,MASK}.
+
+#define BXT_DRAM_TYPE_LPDDR3 0x1
+#define BXT_DRAM_TYPE_LPDDR4 0x2
+#define BXT_DRAM_TYPE_DDR3L 0x4
+/*
+ * BIOS programs this field of REQ_DATA [5:0] in integer
+ * multiple of 133330 KHz (133.33MHz)
+ */
+#define SKL_MEMORY_FREQ_MULTIPLIER 0x208D2
There's a bad tab here that should be a space.
Also, it makes a looooot more sense to use the decimal value instead of
0x208D2.
+#define BXT_D_CR_DRP0_DUNIT8 0x1000
+#define BXT_D_CR_DRP0_DUNIT9 0x1200
+#define BXT_D_CR_DRP0_DUNIT_MAX 4
+#define BXT_D_CR_DRP0_DUNIT(x) _MMIO_MCHBAR_PIPE(x,
BXT_D_CR_DRP0_DUNIT8, BXT_D_CR_DRP0_DUNIT9)
+#define BXT_DRAM_RANK_MASK 0x3
+#define BXT_DRAM_RANK_SINGLE 0x1
+#define BXT_DRAM_RANK_DUAL 0x3
+
+#define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU _MMIO(MCHBAR_MIRROR
_BASE_SNB + 0x5E04)
Why does BXT look at BIOS_REQ while SKL looks at BIOS_DATA? Shouldn't
SKL also look at BIOS_DATA or BXT look at BIOS_REQ? What's the big
difference? Also, that comment mentioning that we may sometimes get
zero is a little worrying: shouldn't we make sure we always read when
it's non-zero?
Thanks for the patch, and sorry for the huge amount of questions here:
learning while doing the review is sometimes tricky.
+#define SKL_REQ_DATA_MASK (0xF << 0)
+#define SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIR
ROR_BASE_SNB + 0x500C)
+#define SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIR
ROR_BASE_SNB + 0x5010)
+#define SKL_DRAM_CHANNEL_WIDTH_MASK 0x3
+#define SKL_DRAM_CHANNEL_WIDTH_SHIFT 8
+#define SKL_DRAM_WIDTH_1 0x0
+#define SKL_DRAM_WIDTH_2 0x1
+#define SKL_DRAM_WIDTH_4 0x2
+#define SKL_DRAM_RANK_MASK 0x1
+#define SKL_DRAM_RANK_SHIFT 10
+#define SKL_DRAM_RANK_SINGLE 0x0
+#define SKL_DRAM_RANK_DUAL 0x1
+
/* Please see hsw_read_dcomp() and hsw_write_dcomp() before using
this register,
* since on HSW we can't write to it using I915_WRITE. */
#define D_COMP_HSW _MMIO(MCHBAR_MIRROR_BASE_S
NB + 0x5F0C)
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx